The research goal for the project reported here has been to develop
enabling  technology  for  building  large  knowledge  bases  designed  to 
support  automated  and  semi-automated  problem  solving  for  scientific  and 
engineering  domains. 

This  report  has  the  following  sections: 

•  Task  Structure  Analysis 

•  Representing  and  Reasoning  about  Devices  and  Causal  Processes 

•  Visual  Reasoning 

•  Reconfigurable  Devices  and  Other  Applications 

•  References 

•  Papers  and  Publications  Supported  by  this  Project 

•  Appendices:  Included  Papers 


Task  Structure  Analysis 

Our  work  on  analyzing  problem-solving  tasks,  much  of  it  preceding  the 
project  reported  here,  has  led  to  the  view  that  there  are  distinct, 
analyzable “Generic Tasks” (GTs) that occur naturally in problem-solving
activity,  and  that  largely  determine: 

• the forms of knowledge that will be most useful to store in a knowledge
base, 

•  the  problem-solving  strategies  that  will  be  effective  for  using  that 
knowledge  to  solve  problems,  and 

• the knowledge organizations that will make knowledge retrieval
efficient.

Some other determiners of forms and organizations of knowledge, and of
effective reasoning strategies, are: the nature of the available
computational resources for problem solving (parallel or serial, size of
short-  and  long-term  memory,  processing  speed,  etc.);  the  forms  in  which 
knowledge  is  most  readily  available;  and  the  forms  of  knowledge  that  most 
readily  support  human  interaction,  including  explanation  of  reasoning 
processes and justification of conclusions.

We  have  distinguished  and  studied  seven  “elementary”  GTs,  with  their 
associated  knowledge  forms  and  reasoning  strategies: 

•  Hierarchical  Classification-  Classify  an  object,  event,  or  situation 
with  respect  to  a  taxonomic  hierarchy.  A  useful  control  strategy  is 
“establish-refine,”  i.e.,  top-down,  prune-or-pursue  navigation  of  the 
hierarchy  of  classification  categories,  where  each  category  is  associated 
with  knowledge  to  support  Hypothesis  Matching. 

•  Hypothesis  Matching-  Recognize  whether  a  concept  applies  to  a  given 
situation,  producing  a  confidence  symbol.  Knowledge  is  usefully 
organized  hierarchically,  along  lines  of  abstraction  of  features.  That  is,  a 
“feature”  may  itself  be  recognized  based  upon  a  further  set  of  features. 
Control  may  be  organized  for  either  hypothesis-driven  or  event-driven 
invocation. This generic task has also been called “structured matching,”
“routine  recognition,”  and  “concept  matching  for  relevance.” 

•  Routine  Design-  Construct  a  design  or  plan  according  to  specifications, 
where  a  family  of  design  plans  is  known  in  advance,  but  how  plans  and 
sub-plans  fit  together  must  be  decided  at  run  time.  Useful  control 
strategy:  plan  selection  and  refinement.  Useful  forms  of  knowledge: 
designs,  design  plans  (plans  for  constructing  designs),  plan  steps, 
subplans,  constraints  to  check,  and  backtracking  plans.  [3,4] 

•  Knowledge-Directed  Data  Retrieval-  Retrieve  data  “intelligently” 
using  indirect  inference  if  necessary  (either  for  automated  problem 
solving  or  direct  human  use).  Data-related  concepts  are  active  entities 
that  reside  in  networks  of  semantic  relations.  At  a  basic  level  the 
control  mechanism  is  indirect  inference  (attached  inference  procedures) 
whereby  the  active  concept  that  catches  a  question  (because  it 
specializes in a concept that occurs in the question) knows what to do to
infer an answer. Specialized forms of attached inference take advantage
of the semantic relations between the concepts. A particularly
important use of knowledge-directed data retrieval is for
“data-abstraction,” the transformation of raw data to forms more useful for
problem solving, e.g., inferring normality/abnormality and trends. [33]

• Abductive Hypothesis Assembly- Find the best explanation for
some  body  of  data  (constructing  a  composite  hypothesis  if  necessary) 
and  estimate  confidence.  Useful  forms  of  knowledge  include:  knowledge 
for  finding  potential  explainers  (forms  include:  cues,  hierarchical 
classification  knowledge,  causal  knowledge),  hypothesis  matching 
knowledge  for  evaluation  and  screening  of  potential  explainers, 
knowledge  for  determining  explanatory  coverage  (e.g.  causal 
knowledge),  and  interaction  knowledge  for  hypothesis  fragments  (e.g., 
incompatibility).  Useful  control  strategy:  specialized  means-ends  with 
the  goals  of  explaining  as  much  as  possible,  maintaining  consistency, 
and  maximizing  confidence.  [27, 41,  25,  26,  18] 

•  Prediction  by  State  Abstraction-  Predict  the  effects  on  a  system, 
given  changes  to  lower-level  subsystems.  Control  usefully  proceeds 
bottom-up  through  the  hierarchy  of  concepts  associated  with 
subsystems. 

•  Goal  Abandonment-  Decide  which  goal  to  abandon  when  it  is  judged 
that  not  all  active  goals  can  be  achieved.  Useful  forms  of  knowledge: 
precompiled  comparative  priorities,  goal-subgoal  hierarchy.  Useful 
control  strategy:  from  the  points  in  the  goal-subgoal  hierarchy 
representing the active goals in conflict, inherit priority knowledge from
active  supergoals. 
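
The establish-refine strategy named under Hierarchical Classification can be sketched as follows. This is a minimal illustration rather than the project's implementation: the `Category` class, the toy fault hierarchy, and the use of a boolean matcher in place of a confidence symbol are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Case = Dict[str, object]


@dataclass
class Category:
    """A node in the classification hierarchy (hypothetical structure)."""
    name: str
    # Stand-in for hypothesis-matching knowledge: True means "established".
    matcher: Callable[[Case], bool]
    children: List["Category"] = field(default_factory=list)


def establish_refine(node: Category, case: Case) -> List[str]:
    """Return the most specific categories established at or below `node`."""
    if not node.matcher(case):
        return []                       # prune: reject the whole subtree
    refined: List[str] = []
    for child in node.children:         # pursue: refine into subcategories
        refined.extend(establish_refine(child, case))
    return refined or [node.name]       # no child established: stop here


# Hypothetical toy hierarchy of engine faults.
leak = Category("fuel-leak", lambda c: bool(c.get("fuel_smell", False)))
pump = Category("pump-failure", lambda c: c.get("pump_pressure") == "low")
fuel = Category("fuel-system-fault", lambda c: c.get("engine") == "stalls",
                [leak, pump])
root = Category("fault", lambda c: True, [fuel])

result = establish_refine(root, {"engine": "stalls", "pump_pressure": "low"})
print(result)
```

Because a rejected category is never refined, the matching knowledge attached to its descendants is never consulted, which is where the strategy gets its efficiency.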

Elementary  GTs  form  building  blocks  for  more  complex  reasoning  tasks.  A 
major  goal  of  this  project  was  to  analyze  the  task  structures  of  design, 
diagnosis,  and  redesign  (finding  inadequacies  in  designs,  explaining 
their  emergence,  and  proposing  fixes)  [6,  7, 11].  These  were  selected  for 
special  attention  because  of  the  theoretical  challenges  they  present,  and 
because  systems  for  assisting  with  these  tasks  have  an  enormous  potential 
for  useful  applications.  Analyzing  the  structure  of  problem-solving  tasks 
involves  determining  appropriate  methods  for  accomplishing  them  under 
various conditions, analyzing the subtasks spawned by the methods, and
determining  the  characteristics  of  the  various  methods,  including  their 
knowledge  requirements. 

Analyzing design, redesign, and diagnosis problem-solving tasks has led us
to  a  more  general  view  of  task  decomposition  and  the  generation  of 
subtasks  [9].  In  particular  we  have  found  it  to  be  very  useful  to  distinguish 
method  selection  as  an  important  distinctive  activity  which  arises  after  a 
problem  solving  task  or  subtask  has  been  generated.  At  that  time  the 
relevant  available  information  can  be  used  in  selecting  the  most  promising 
method to be used in carrying out the task. The method selection may
itself spawn additional needs for knowledge and problem solutions. So the
recursive decomposition of an overall problem into task and subtask is not
always fixed, but is a product of the interaction of the problem
instance, the knowledge available to the system when the task arises, and
perhaps some history of efforts at solving the problem.

A task structure analysis is a functional decomposition of an information
processing task. Such an analysis identifies domain-specific and
domain-independent aspects of a task and methods for its accomplishment. The
task can be feasibly accomplished if a decomposition can be found such
that each subtask is feasible, and can be combined with the other subtasks
by a feasible method. A task can have one or more methods associated
with accomplishing it. Each method is specified in terms of how it uses
knowledge and inference to achieve its goals, and in terms of what
subtasks (subgoals) it sets up and requires to be achieved before it can
succeed. Alternative methods for accomplishing a task may make use of a
common subtask. This kind of decomposition can be done recursively until
methods  which  achieve  subtasks,  but  which  do  not  set  up  additional 
subtasks  of  their  own,  are  reached. 
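
The recursive notion of feasibility described above can be illustrated with a small sketch. The `Task` and `Method` classes and the task and method names are hypothetical, and the sketch assumes the decomposition is acyclic; it is not drawn from any LAIR system.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Method:
    name: str
    # Subtasks the method sets up; an empty list marks a method that
    # achieves its task directly, ending the recursion.
    subtasks: List["Task"] = field(default_factory=list)


@dataclass
class Task:
    name: str
    methods: List[Method] = field(default_factory=list)


def usable_methods(task: Task) -> List[str]:
    """Names of the methods whose subtasks are all feasible."""
    return [m.name for m in task.methods
            if all(feasible(s) for s in m.subtasks)]


def feasible(task: Task) -> bool:
    """A task is feasible if at least one of its methods is usable."""
    return bool(usable_methods(task))


# Hypothetical task structure: two methods share the subtasks
# "propose" and "verify"; "critique" has no known method.
propose = Task("propose", [Method("retrieve-similar-case")])
verify = Task("verify", [Method("run-qualitative-simulation")])
critique = Task("critique")
design = Task("design", [
    Method("propose-critique-modify", [propose, verify, critique]),
    Method("propose-and-verify-only", [propose, verify]),
])

print(usable_methods(design))
```

Returning the list of usable methods, rather than a bare yes or no, leaves room for a separate method-selection step that chooses among them using information available when the task arises.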

We  investigated  how  elementary  generic  tasks  can  be  composed  and 
integrated,  trying  to  combine  the  advantages  of  task-specific  architectures 
(computational efficiency, modularity, knowledge-acquisition, knowledge-base
debugging, knowledge-maintenance) with the flexibility and
robustness inherent in generalized goal-seeking control of problem-solving.
Various approaches to integrating basic generic reasoning tasks
have  been  explored  at  LAIR  in  recent  years.  A  discussion  of  the  issues 
involved  is  given  in  B.  Chandrasekaran,  "Design  problem  solving:  A  task 
analysis"  which  is  included  as  an  Appendix  with  this  report.  Additional 
discussion  is  available  in  [11, 19,  22,  23,  24,  40,  42]. 

Task  Analysis  of  Design 

The  most  common  top-level  family  of  methods  for  design  can  be 
characterized as Propose-Critique-Modify (PCM) methods. These
methods  have  the  subtasks  of  proposal  of  partial  or  complete  design 
solutions,  verification  of  proposed  solutions,  critiquing  the  proposals  by 
identifying  causes  of  verification  failure,  if  any,  and  modification  of 
proposals  to  better  satisfy  design  goals.  These  subtasks  can  be  combined 
in  fairly  complex  ways,  but  the  following  is  one  straightforward  way  in 
which  a  PCM  method  may  organize  and  combine  the  subtasks. 


•  Step  1.  Given  design  goal,  propose  solution.  If  no  proposal,  exit  with 
failure. 

•  Step  2.  Verify  proposal.  If  verified,  exit  with  success. 

•  Step  3.  If  unsuccessful,  critique  proposal  to  identify  sources  of  failure. 
If  no  useful  criticism  available,  exit  with  failure. 

•  Step  4.  Modify  proposal  and  return  to  2. 
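
The four steps above can be written as a short control loop. The sketch below is only one straightforward reading of them: the four callables are supplied by the caller, and the `max_rounds` bound is an assumption added so the loop always terminates; the report does not specify one.

```python
from typing import Callable, List, Optional, TypeVar

D = TypeVar("D")


def propose_critique_modify(
    propose: Callable[[], Optional[D]],
    verify: Callable[[D], bool],
    critique: Callable[[D], List[str]],
    modify: Callable[[D, List[str]], D],
    max_rounds: int = 20,               # assumed safety bound, not in the text
) -> Optional[D]:
    """Return a verified design, or None on failure."""
    design = propose()                  # Step 1: propose a solution
    if design is None:
        return None                     # no proposal: exit with failure
    for _ in range(max_rounds):
        if verify(design):              # Step 2: verify the proposal
            return design
        criticisms = critique(design)   # Step 3: identify sources of failure
        if not criticisms:
            return None                 # no useful criticism: exit with failure
        design = modify(design, criticisms)  # Step 4: modify, return to Step 2
    return None


# Hypothetical toy problem: pick a beam width (in cm) that carries a load of
# 35, where each cm of width is assumed to carry 10 units.
result = propose_critique_modify(
    propose=lambda: 1,
    verify=lambda w: w * 10 >= 35,
    critique=lambda w: ["capacity too low"] if w * 10 < 35 else [],
    modify=lambda w, notes: w + 1,
)
print(result)
```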

While  all  PCM  methods  will  need  to  have  some  way  to  achieve  the  iteration 
in  Step  4  above,  there  can  be  numerous  variants  on  the  way  the  methods 
in  this  class  work.  For  example,  a  solution  may  be  proposed  only  for  a  part 
of  the  design  problem,  a  part  deemed  to  be  crucial.  This  solution  may  then 
be  critiqued  and  modified.  This  partial  solution  may  generate  additional 
constraints,  leading  to  further  design  commitments.  Thus  subtasks  can  be 
scheduled  in  a  fairly  complex  way,  with  subgoals  from  different  methods 
alternating. 

In  "Design  problem  solving:  A  task  analysis,"  which  is  included  in  the 
Appendix  to  this  report,  a  detailed  discussion  is  provided  of  the  methods 
that  are  available  for  each  of  the  major  subtasks  of  design  problem  solving, 
along  with  an  analysis  of  the  knowledge  requirements  for  each  of  the 
methods and control strategies. This paper also discusses the
implications  of  this  analysis  for  the  construction  of  design  knowledge 
bases. 

Task  Analysis  of  Abduction 

The concept of abduction has been used to frame the problems of diagnosis,
scientific theory formation, and natural language understanding, and is a more
general framework than classification for describing visual recognition.
Abduction, or inference to the best explanation, is a distinctive kind of
inference  that  goes  from  data  describing  something  to  an  explanatory 
hypothesis  that  best  explains  or  accounts  for  the  data.  Thus  abduction  is  a 
kind  of  theory-forming  or  interpretive  inference.  Abductive  inferences 
have  this  form,  approximately: 

D  is  a  collection  of  data  (facts,  observations,  givens). 

H (hypothesis) explains D (would, if true, explain D).

No  other  hypothesis  is  able  to  explain  D  as  well  as  H  does. 


Therefore,  H  is  probably  true. 


The  judgment  of  likelihood  in  the  conclusion  depends  on: 

•  How  decisively  H  surpasses  the  alternatives. 

•  How  good  H  is  by  itself,  independently  of  considering  the 
alternatives.

•  How  much  confidence  there  is  that  all  plausible  explanations  have 
been  considered  (how  thorough  was  the  search  for  alternative 
explanations).

We  can  analyze  the  information-processing  task  of  abduction  to  have  two 
main  subtasks: 

•  subtask  1)  generating  elementary  hypotheses, 

•  subtask  2)  synthesizing  composite  explanations. 

The  subtask,  generating  elementary  hypotheses,  has  two  sub-subtasks: 

•  sub-subtask  1.1)  evoking 

•  sub-subtask  1.2)  instantiating  elementary  hypotheses 

The sub-subtask 1.2, instantiating elementary hypotheses, has two
sub-sub-subtasks:

•  sub-sub-subtask  1.2.1)  scoring  the  elementary  hypotheses 

•  sub-sub-subtask  1.2.2)  determining  explanatory  coverage. 

This  decomposition  is  very  general.  A  typical  abductive  conclusion  is  a 
composite, and somehow, explicitly or implicitly, there must be some
method  of  choosing  which  elementary  hypothesis  to  consider,  some  way  of 
making  them  specific  to  the  case,  and  some  way  of  accepting  and 
combining. 
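
To make the decomposition concrete, the sketch below walks through evoking, instantiating (scoring and determining coverage), and synthesizing a composite. It is a toy under stated assumptions: the cue table, the plausibility numbers, and the device-fault names are invented, and the greedy coverage-first assembly is just one simple way to combine hypotheses, not the strategy used in the LAIR systems.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Hypothesis:
    name: str
    score: float                # 1.2.1: plausibility score
    covers: FrozenSet[str]      # 1.2.2: the data it would explain


def evoke(data: Set[str], cues: Dict[str, List[str]]) -> Set[str]:
    """1.1: each datum cues the hypotheses worth considering."""
    return {h for d in data for h in cues.get(d, [])}


def instantiate(names: Set[str], data: Set[str],
                explains: Dict[str, Set[str]],
                plausibility: Dict[str, float]) -> List[Hypothesis]:
    """1.2: make each evoked hypothesis specific to the case."""
    return [Hypothesis(n, plausibility[n], frozenset(explains[n] & data))
            for n in sorted(names)]


def synthesize(hyps: List[Hypothesis],
               data: Set[str]) -> Tuple[List[str], Set[str]]:
    """2: greedily assemble a composite; return it with any unexplained data."""
    unexplained, composite, candidates = set(data), [], list(hyps)
    while unexplained and candidates:
        best = max(candidates,
                   key=lambda h: (len(h.covers & unexplained), h.score))
        if not best.covers & unexplained:
            break                       # nothing left can explain more
        composite.append(best.name)
        unexplained -= best.covers
        candidates.remove(best)
    return composite, unexplained


# Hypothetical knowledge for a small device-fault example.
cues = {"overheating": ["fan-failure", "power-fault"],
        "noise": ["fan-failure", "bearing-wear"],
        "no-output": ["power-fault"]}
explains = {"fan-failure": {"overheating", "noise"},
            "bearing-wear": {"noise"},
            "power-fault": {"overheating", "no-output"}}
plausibility = {"fan-failure": 0.7, "bearing-wear": 0.5, "power-fault": 0.6}

data = {"overheating", "noise", "no-output"}
hyps = instantiate(evoke(data, cues), data, explains, plausibility)
composite, unexplained = synthesize(hyps, data)
print(composite, unexplained)
```

A fuller treatment would also use interaction knowledge such as incompatibility between hypotheses, and would report a confidence for the composite; both are omitted here for brevity.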

A rich and comprehensive book on abduction that describes research
progress  made  at  LAIR  over  several  years  has  been  completed,  and  will 
soon  be  published:  Abductive  Inference:  Computation,  Philosophy, 
Technology,  Ed.  J.  and  S.  Josephson,  Cambridge  U.  Pr.  (forthcoming). 

Task-Structure Analysis and the Construction of Technical
Knowledge  Bases 

In “Task-Structure Analysis for Knowledge Modeling,” which is included in
the  Appendix,  we  discuss  in  detail  how  the  task  structure  provides  a  road 
map  for  knowledge  modeling  and  hence  for  knowledge  representation. 

The  generic  methods  in  the  task  structure  require  specific  types  of 
knowledge that can be supported by task- and method-specific high-level
languages. 


Prediction 

We  have  devoted  much  effort  to  identifying  knowledge  useful  for 
prediction,  because  prediction  is  one  of  the  most  pervasive  subtasks 
generated  by  processes  of  design  checking  and  diagnostic  hypothesis 
evaluation.  We  have  been  concerned  with  common-sense  simulation  of 
the  world,  and  its  integration  with  more  technical  and  mathematical 
simulation  methods,  as  would  be  found  in  an  engineer's  problem  solving. 
A  treatment  of  some  of  the  foundational  issues  is  given  in:  B. 
Chandrasekaran, “QP is More Than SPQR and Dynamical Systems Theory:
Response  to  Sacks  and  Doyle,”  which  appeared  in  Computational 
Intelligence,  and  is  included  in  the  Appendix  with  this  report. 


Representing  and  Reasoning  About  Devices  and  Causal 
Processes 

The “Functional Representation Language” (FR) was originally
introduced  by  Sembugamoorthy  and  Chandrasekaran  (1986)  as  a  device 
representation  from  which  diagnosis-specific  knowledge  could  be  derived 
by  a  process  of  knowledge  compilation.  It  was  subsequently  applied  to 
support  other  tasks  besides  diagnosis  (most  notably  forms  of  qualitative 
simulation),  and  used  to  represent  a  wide  variety  of  device  types. 

Humans  understand  how  the  functions  of  a  device  result  from  causal 
processes  in  which  the  components  and  subsystems  of  the  device  play 
various  roles.  We  have  sought  to  imitate  human  abilities  to  reason 
causally by pursuing “machine understanding of how things work.”

Making  distinctions  between  device  structures,  functions,  and  the  causal 
processes that support functions, is at the heart of the FR; consequently we
have used FR as a representational point of departure for these
investigations. The main goals have been to devise structures for
knowledge organization supporting automatic and manual navigation of
multiple levels of functional and causal detail, for a wide range of types of
devices and causal processes; and to determine the types of knowledge
that are needed to support the problem-solving abilities that normally, in
humans, result from understanding how a device works.

FR represents devices (abstract or physical), components, functions, and
causal processes - and coherently ties these representations together by
connecting  device  functionality  to  the  causal  processes  by  which  functions 
are  achieved.  Since  its  introduction  by  Sembugamoorthy  and 
Chandrasekaran,  FR  has  been  developed  and  applied  by  Chandrasekaran, 
Sticklen,  Keuneke,  Josephson,  Goel,  Allemang,  Weintraub,  Punch,  Dejongh, 
Darden, Moberg, Iwasaki, and others. A very diverse set of devices has
been represented in FR, including: electro-mechanical devices [43] [45],
biological  processes  [15]  [50]  [40],  non-AI  computer  programs  [2], 
knowledge-based  systems  [51],  manufacturing  fabrication  processes  [48], 
landscape-level  ecological  systems  [49],  logistic  plans  [12],  and  parts  of 
chemical-processing  plants  [30]  [17].  Inference  procedures  have  been 
developed  to  use  these  representations  for  case-based  design  and  redesign 
[17],  automatic  compilation  of  diagnostic  knowledge  [43]  [50],  various 
subtasks  of  diagnosis  [40]  [50]  [1],  program  debugging  [2],  design  criticism 
and verification [17] [20], prediction [39], automatic software-hardware
reconfiguration  [32],  explanation  [12]  [30],  and  control  of  simulation  [45] 
[31]. 

FR  is  complementary  to  the  more  common  “bottom-up”  device 
representations  in  which  the  behavioral  characteristics  of  each  component 
or process in isolation are represented, and the behavior of the whole is
inferred,  given  information  about  how  components  or  processes  are 
combined.  FR  differs  in  taking  a  more  “top-down”  view  in  which  device 
functions  are  explicitly  represented,  along  with  the  roles  that  components 
and  processes  play  in  achieving  those  functions.  FR  is  especially 
appropriate  for  representing  a  designer's  intent,  but  is  also  suitable  for 
representing  unintended  functions  and  behaviors.  During  this  project  we 
made  progress  in  combining  the  strengths  of  FR  and  component-centered 
representations.  Korda,  Josephson,  and  Moberg  [31,  32]  demonstrated  the 
use  of  an  FR-based  representation  to  direct  simulation  processes  which  are 
based  on  a  VHDL-like  representation.  Iwasaki  and  Chandrasekaran  [20] 
(included  in  the  Appendix)  demonstrated  a  way  to  do  design  verification, 
in  a  device-modeling  environment  called  DME  [21],  by  using  FR  in 
conjunction  with  qualitative  simulation. 

During  this  project  we  enhanced  the  representations  for  device  functions, 
device  structure,  and  causal  processes;  and  devised  and  investigated 
reasoning  strategies  able  to  use  these  representations  to  accomplish 
various  problem-solving  subtasks  supportive  of  design,  diagnosis,  and 
redesign. 

FR  recognizes  the  following  major  datatypes  (classes  and  subclasses): 

Devices - Components of devices are devices in their own right. A device
may  have  associated  Ports. 

A  device  will  usually  have  a  description  of  its  structure  -  given  as  a  set  of 
components,  their  associated  ports,  and  a  set  of  CONNECTIONS  between 
ports.  Connections  allow  the  passage  of  Substances,  which  may  be 
material  (e.g.,  water),  or  abstract  (e.g.,  heat  or  information).  Ports  and 
connections  come  in  types  according  to  the  types  of  the  substances  they 
will pass. Device structure may also have other kinds of description (e.g.,
spatial, not just connection topology), but this aspect has not been very
much explored in the context of FR. In principle a device may have
separate descriptions depending on what perspective is taken; for example,
a wire may be viewed as a perfect conductor for some purposes, to have a
constant but low resistance for other purposes, or to have a resistance
which varies with temperature for others. While we recognized the need
for multiple perspectives, we will probably not incorporate it into our
representations  for  some  time,  pending  a  better  theory  of  when 
perspective  switching  is  appropriate. 

STATES  -  A  “state”  is  characterized  by  a  partial  description  of  the  device  or 
its  environment  at  a  moment  or  over  an  interval  of  time.  This  is  usually 
given  as  a  Boolean  combination  of  predicates  over  State-variables 
(which  may  be  discrete  or  continuous).  A  device  has  Modes  which  are 
states of the device. For example a valve may be Fully-open,
Partially-open, or Closed. Every device has two major modes: Normal and
Abnormal. Known Malfunction modes are represented as sub-states of
the  Abnormal  mode.  In  general  Abnormal  means  that  device  behavior 
cannot  be  predicted  according  to  the  specifications  of  the  Normal  mode, 
whether  or  not  it  can  be  predicted  as  belonging  to  some  Malfunction  state 
for  which  a  representation  has  been  given. 

Functions  -  a  device  has  a  set  of  functions  associated  with  each  of  its 
modes. For a Normal mode the functions may be marked as Intended
Functions or Side Effects. Keuneke (1989, 1991) distinguished four
types of functions: to Achieve, to Prevent, to Maintain, and to
Control. Iwasaki and Chandrasekaran (1992) have added Allow. Each
function  type  is  specified  somewhat  differently.  We  will  focus  here  on 
Achieve  functions,  since  the  representation  for  them  is  the  oldest  and  best 
developed. 

An  Achieve  function  is  associated  with  an  ordered  pair  of  states,  called 
the If State and the To-Make State. The idea is that the To-Make state
is  achieved,  starting  from  the  If  state,  by  using  the  particular  function  of 
the  device.  Often  the  To-Make  state  is  described  by  its  difference  from  the 
If state. To explain how the To-Make state is achieved, one points to the
responsible  device  and  function.  The  If  and  To-Make  states  may  refer  to 
conditions  at  one  or  more  ports;  thus  the  activity  of  transforming  values  at 
input  ports  to  values  at  output  ports  is  a  kind  of  Achieve  function.  This 
allows  FR  to  absorb  structure-behavior  representations  (such  as  VHDL,  but 
there  are  many  others)  that  are  based  only  on  components,  ports, 
connections,  and  mathematical  transformations  from  inputs  to  outputs. 
Sometimes an additional state, called the Provided State, is associated
with a function; it is used to specify conditions (other than those
specified by the If state) under which the function can be expected to
achieve its To-Make state, e.g., standard operating conditions. One reason
for representing the Provided state is so that, if it is detected that the
To-Make state has not been achieved, even though the If state did occur, then
a diagnostic process will know to check whether the Provided state
obtained, rather than simply concluding that the device had malfunctioned.
Sometimes a Triggering State is also associated with a function, giving
conditions under which the device is expected to immediately begin a
process to achieve the function.
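
The diagnostic use of the Provided state can be shown with a small sketch. States are modeled here as predicates over a dictionary of state variables; the `AchieveFunction` class, the `assess` function, its four result labels, and the pump example are all assumptions made for illustration and are not FR syntax.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

StateVars = Dict[str, float]
# A state is a partial description: a predicate over state variables.
State = Callable[[StateVars], bool]


@dataclass
class AchieveFunction:
    name: str
    if_state: State                     # starting condition
    to_make: State                      # condition the function brings about
    provided: Optional[State] = None    # e.g. standard operating conditions


def assess(fn: AchieveFunction, before: StateVars, after: StateVars) -> str:
    """Classify one observed episode of the function (hypothetical labels)."""
    if not fn.if_state(before):
        return "not-applicable"         # the If state never occurred
    if fn.to_make(after):
        return "achieved"
    if fn.provided is not None and not fn.provided(before):
        return "provided-state-not-met" # check conditions before blaming device
    return "suspect-malfunction"


# Hypothetical pump: when switched on it should produce flow, provided the
# supply voltage is at least 11 V.
pump = AchieveFunction(
    name="pump-water",
    if_state=lambda v: v["switch_on"] == 1,
    to_make=lambda v: v["flow"] > 0,
    provided=lambda v: v["supply_volts"] >= 11,
)

low_voltage = {"switch_on": 1, "supply_volts": 9, "flow": 0}
print(assess(pump, low_voltage, low_voltage))
```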

Causal-Process  Descriptions  -  Functions  are  accomplished  by  way  of 
causal  processes.  If  the  process  is  known  for  a  particular  function,  a  Causal 
Process  Description  (CPD)  is  associated  with  the  representation  for  the 
function.  The  CPD  is  a  “causal  story”  describing  how  the  function  is 
accomplished.  A  CPD  is  a  directed  graph,  where  the  nodes  are  States  and 
the  edges  are  Annotated  State  Transitions.  The  CPD  may  be  cyclic,  but 
must  include  a  directed  path  from  the  If  state  to  the  To-Make  state  (for 
Achieve  functions). 
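
The structural requirement on a CPD, that it may contain cycles but must include a directed path from the If state to the To-Make state, can be checked with a breadth-first search. The sketch below assumes states are identified by name and transitions are given as (source, destination, annotation) triples; the buzzer example and its annotations are hypothetical.

```python
from collections import deque
from typing import Dict, List, Tuple

# (from_state, to_state, annotation): a hypothetical flat encoding of a CPD.
Transition = Tuple[str, str, str]


def has_causal_path(transitions: List[Transition],
                    if_state: str, to_make: str) -> bool:
    """True if a directed path leads from `if_state` to `to_make`."""
    successors: Dict[str, List[str]] = {}
    for src, dst, _annotation in transitions:
        successors.setdefault(src, []).append(dst)
    seen = {if_state}
    queue = deque([if_state])
    while queue:
        state = queue.popleft()
        if state == to_make:
            return True
        for nxt in successors.get(state, []):
            if nxt not in seen:         # the seen set makes cycles harmless
                seen.add(nxt)
                queue.append(nxt)
    return False


# Hypothetical CPD for a buzzer; the last edge closes a cycle.
buzzer_cpd = [
    ("switch-closed", "coil-energized", "function: battery supplies current"),
    ("coil-energized", "clapper-pulled", "general knowledge: magnetism"),
    ("clapper-pulled", "sound-produced", "function: clapper strikes bell"),
    ("clapper-pulled", "coil-energized", "cpd: contact breaks and remakes"),
]

print(has_causal_path(buzzer_cpd, "switch-closed", "sound-produced"))
```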

Annotated  State-Transitions  -  An  annotated  state  transition  may  be 
used  to  represent  an  actual  change  in  the  represented  device,  or  merely  a 
convenient  change  in  state  description.  In  the  former  case  a  causal 
explanation  would  be  appropriate,  but  in  the  latter  case  an  inferential 
explanation.  For  example  the  transition: 

Temperature (location-1, 150° C)

↓

Temperature (location-1, abnormally-high)

might  be  explained  as  an  abstraction  step,  which  would  depend  on 
knowledge  of  normality  conditions. 

An annotation associates a data object with the state transition; the data
object  may  be  of  any  of  four  types: 

•  CPD  -  The  process  by  which  the  state  transition  occurs  is  described  in 
more  detail. 

•  Inference  -  A  change  of  state  description,  rather  than  an  actual  change 
of  underlying  state.  The  inference  shows  how  the  second  state 
description  follows  from  the  first  one,  given  the  requisite  knowledge. 

•  Function  of  a  Component  -  A  particular  component,  by  performing  one 
of  its  functions,  is  responsible  for  the  state  transition.  FR  thus  provides 
a  kind  of  recursive  decomposition:  functions  are  explained  by  the 
causal  processes  by  which  they  are  achieved,  and  causal  processes  are 
explained  by  the  functions  of  the  components  that  are  responsible  for 
state  transitions  that  make  up  the  causal  process.  This  is  the  way  the 
FR  represents  how  the  functions  of  a  device  arise  from  the  functions  of 
the  device’s  components. 

•  General  Knowledge  principle  -  The  state  transition  can  be 
understood  to  occur  as  the  result  of  some  general  principle,  for  example 
falling  as  a  result  of  gravity,  or  increasing  in  temperature  due  to 
friction. A convenient way to represent general knowledge principles,
especially  scientific  laws,  is  in  the  form  of  mathematical  equations  or 
functions  relating  parameters  of  the  antecedent  state  to  parameters  of 
the  subsequent  one.  When  this  is  done,  the  FR  can  be  used  to  guide 
numerical  simulations:  the  FR  is  used  to  organize  the  set  of  equations 
constituting  the  mathematical  model  of  the  system,  and  then  the 
representation  guides  the  propagation  of  numerical  values  through  the 
set  of  equations.  This  way  the  FR  representation  qualitatively  organizes 
the  causal  dependencies,  with  the  numerical  equations  giving  the 
precise  details  and  the  basis  for  calculations.  This  use  of  FR  has  been 
explored  extensively  by  Sticklen  and  his  colleagues  at  Michigan  State. 

A  state  transition  is  “what  is  accomplished,”  and  its  annotation  explains 
“how  it  is  accomplished.”  Thus  the  transition  represents  a  kind  of  role 
(the  need  to  get  from  the  one  state  to  the  other),  and  the  annotation  points 
to  the  role  filler.  More  than  one  annotation  may  be  given,  corresponding  to 
knowledge  of  more  than  one  “actor”  capable  of  playing  the  role.  (This  way 
FR  can  encode  “multiple  realizability.”)  Besides  its  annotations,  a  state 
transition  may  have  an  associated  Transition  Condition  (a  state)  under 
which  the  transition  can  be  expected  to  occur.  Such  a  condition  can  be 
tested  during  a  simulation  to  determine  whether  the  state  variables  should 
be  updated  (e.g.,  whether  associated  equations  apply  at  that  point). 
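
The use of transition conditions and equation-style annotations to guide a numerical simulation can be sketched as follows. This is an illustrative toy, not the simulators cited in this report: the `AnnotatedTransition` class, the water-heater model, and the assumption that transitions are listed in causal order are all introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Vars = Dict[str, float]


@dataclass
class AnnotatedTransition:
    name: str
    condition: Callable[[Vars], bool]   # Transition Condition
    equation: Callable[[Vars], Vars]    # general-knowledge annotation


def simulate(transitions: List[AnnotatedTransition], state: Vars) -> Vars:
    """Propagate values along transitions assumed to be in causal order."""
    current = dict(state)               # leave the caller's state untouched
    for t in transitions:
        if t.condition(current):        # do the equations apply at this point?
            current.update(t.equation(current))
    return current


SPECIFIC_HEAT_WATER = 4186.0            # J/(kg*K)

# Hypothetical heater: electrical energy becomes heat, heat raises temperature.
heater = [
    AnnotatedTransition(
        "energy-delivered",
        condition=lambda v: v["switch"] == 1.0,
        equation=lambda v: {"heat": v["power"] * v["seconds"]},
    ),
    AnnotatedTransition(
        "water-warms",
        condition=lambda v: v["heat"] > 0.0,
        equation=lambda v: {
            "temp": v["temp"] + v["heat"] / (v["mass"] * SPECIFIC_HEAT_WATER)},
    ),
]

start = {"switch": 1.0, "power": 2000.0, "seconds": 10.0,
         "mass": 1.0, "temp": 20.0, "heat": 0.0}
end = simulate(heater, start)
print(round(end["temp"], 2))
```

The list of transitions carries the qualitative causal ordering, while the equations supply the numerical detail, which mirrors the division of labor described for the General Knowledge annotation.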

Multiple state transitions are allowed to, and from, the nodes in a CPD. By
associating  conditions  with  the  state  transitions,  OR  and  AND  branching  can 
be  represented,  and  used  to  represent  alternative  and  concurrent  causal 
pathways. 

As  we  said,  a  CPD  (causal-process  description)  is  a  directed  graph  whose 
edges  are  annotated  state  transitions.  Inference  procedures  may  traverse 
CPDs  in  either  direction,  forwards  or  backwards.  Traversing  in  the 
forwards  direction,  “consequence  finding,”  moves  from  cause  to  effect,  and 
supports  predictive  inference.  Traversing  backwards,  “antecedent  finding,” 
moves  from  effect  to  cause,  and  supports  abductive  inference.  Design  and 
planning  are  logically  dependent  on  prediction.  Diagnosis  and  process 
monitoring  (any  that  goes  beyond  directly-observables)  are  logically 
dependent  on  abduction.  [By  saying  “logically  dependent”  we  emphasize 
that  the  prediction  and  abduction  do  not  necessarily  happen  explicitly  at 
run  time.]  Since  a  device  represented  in  FR  can  be  considered  an  organized 
assembly  of  CPDs,  each  of  which  is  an  organized  assembly  of  annotated 
state  transitions,  the  device  representation  as  a  whole  supports  prediction 
and  abduction,  and  so  it  supports  design  and  process  monitoring. 
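
The two directions of traversal can be sketched over a CPD given as a list of (cause, effect) pairs. The function names follow the terms used above, but the encoding and the water-tank example are assumptions for illustration; annotations and transition conditions are left out to keep the sketch short.

```python
from typing import Dict, List, Set, Tuple

Edge = Tuple[str, str]                  # (cause_state, effect_state)


def _closure(start: str, step: Dict[str, List[str]]) -> Set[str]:
    """All states reachable from `start` by repeatedly following `step`."""
    found: Set[str] = set()
    frontier = [start]
    while frontier:
        state = frontier.pop()
        for nxt in step.get(state, []):
            if nxt not in found:
                found.add(nxt)
                frontier.append(nxt)
    return found


def consequences(state: str, edges: List[Edge]) -> Set[str]:
    """Forward traversal, cause to effect: supports prediction."""
    forward: Dict[str, List[str]] = {}
    for cause, effect in edges:
        forward.setdefault(cause, []).append(effect)
    return _closure(state, forward)


def antecedents(state: str, edges: List[Edge]) -> Set[str]:
    """Backward traversal, effect to cause: supports abduction."""
    backward: Dict[str, List[str]] = {}
    for cause, effect in edges:
        backward.setdefault(effect, []).append(cause)
    return _closure(state, backward)


# Hypothetical CPD fragment with two alternative causes of water flow.
edges = [("valve-open", "water-flows"),
         ("pump-on", "water-flows"),
         ("water-flows", "tank-fills")]

print(sorted(consequences("valve-open", edges)))
print(sorted(antecedents("tank-fills", edges)))
```

The backward traversal returns candidate explanations only; choosing among them is the work of the abductive assembly task described earlier.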

FR  can  be  used  to  represent  any  “mechanism,”  not  just  human-designed 
artifacts. All that is required is that it be possible to take a “functional
view,”  or  “design  stance,”  towards  the  mechanism.  For  example  Sticklen 
(1987)  used  FR  to  represent  portions  of  the  human  immune  system,  and 
more  recently  to  represent  portions  of  an  ecosystem.  One  could  use  FR  to 
represent  the  causal  processes  by  which  clouds  achieve  their  “function”  of 
producing rain. The mission of FR is to represent “how it works to do such
and so,” not per se to represent a designer's intended functions, though
this is an appropriate use. The distinctive contributions of FR for giving
(part of) the rationale for designs are described in “Functional
Representation  as  Design  Rationale,”  [13].  A  reprint  of  this  paper  is 
included  as  an  appendix  to  this  report. 

A  mechanism  represented  in  FR  need  not  be  physical;  for  example 
Allemang (1990) has used it for representing computer programs. To the
degree  that  they  are  understood,  FR  can  be  used  to  represent  the 
mechanisms  that  drive  an  economy  towards  inflation,  and  those  that 
polarize  a  society  towards  civil  war. 

Allemang  showed  how  program  debugging  can  be  guided  by  an  FR 
representation of a program's intended functionality and method. Goel
(1989) showed how to use FR representations for organizing case libraries
of designs, and how the cases can be indexed by function, so a designer can
retrieve candidate designs for components or whole devices. Weintraub
(1991) used FR to represent how a particular knowledge-based diagnostic
system works, and showed how to use the representation for knowledge-base
refinement using explanation-based learning from diagnostic
mistakes. Chandrasekaran, Goel, and Iwasaki (1993) describe the use of FR
for capturing design rationale. Darden, Moberg, and Josephson (1990, 1992)
used FR to represent a historical scientific theory in the domain of
transmission genetics, and used it to model scientific theory change as
“anomaly-driven redesign,” i.e., as diagnosing and fixing a fault in the
theory. Sticklen and his colleagues have been using FR for organizing
simulations. Iwasaki and Chandrasekaran (1992) demonstrated the use of
FR  for  design  verification.  FR  representations  have  been  used  for 
diagnosis.  Sticklen  (1987)  and  Punch  (1989)  explored  combining  FR-based 
diagnosis  with  “compiled”  diagnosis.  Keuneke  (1989)  showed  how  to  use  a 
functional  representation  to  verify  diagnostic  hypotheses  by  constructing 
causal  stories.  Dejongh  (1991)  used  an  FR  representation  of  antibody- 
antigen  reactions  in  a  SOAR-based  system  for  medical  test  interpretation; 
his  system  used  SOAR's  learning  mechanism  (chunking)  to  incrementally 
“compile”  knowledge  for  hypothesization,  from  causality  represented  in  FR. 

Nothing  in  an  FR-based  representation  constrains  the  causality  to  be 
correct.  One  could  represent  the  causal  mechanisms  posited  by  the 
Aristotelian  theory  of  gravitation,  or  by  a  shaman’s  demonology.  While 
correctness  of  an  FR  representation  cannot  be  guaranteed  from  within  the 
representation,  certain  kinds  of  consistency  can  be  enforced,  e.g.,  that  a 
CPD,  if  it  operates  as  described,  will  indeed  achieve  the  state  required  by  a 
function.  FR  is  intended  to  capture  key  aspects  of  the  “logic  of 
comprehension,”  more  specifically,  important  conceptual  connections  that 
make up the core of “causal understanding.” FR is not a complete theory.
Yet it has already displayed impressive functionality and versatility, and
the main path to its improvement should be empirical: by applying it and
criticizing the results.


Visual  Reasoning 

It  is  apparent  that  engineering  descriptions  of  devices  and  processes  have 
often  been  expressed  in  the  forms  of  diagrams  and  schematic  pictures. 
Diagrammatic  and  pictorial  representations  clearly  play  important  roles  in 
human  problem  solving,  as  has  been  noticed  by  philosophers,  cognitive 
psychologists,  design  theorists,  logicians  and  AI  researchers.  Philosophers 
have  been  interested  in  the  nature  of  mental  imagery  for  a  very  long  time, 
and debates about the reality and nature of mental images and visual
representations have also raged in psychology. Design theorists have
always been interested in the role of sketches and diagrams as design aids.
Modern formal logicians have treated diagrams as merely “heuristic” aids
to be discarded once the correct path to the real proof is obtained, but
historically logicians often made use of diagrams of various sorts in
conveying and using logic. AI, while it flirted with diagrams in its early
decades, especially in the early work on geometric theorem proving, has
neglected, until quite recently, focused research on the possible advantages
of such representations. In AI there is a renewed interest in integrating
symbolic and perceptual representations.^

Diagrams  preserve  locality  information,  or  represent  it  directly.  Several 
visual  “predicates”  seem  to  be  efficiently  computed  by  the  visual 
architecture  from  this  sort  of  information,  e.g.,  neighborhood  relations, 
relative size, intersections, and so on. This ability makes certain types of
inferences  easy  or  relatively  direct.  (It  should  be  emphasized  that  only 
some  visual  predicates  are  particularly  easily  computed  by  the  visual 
architecture.  There  exist  numerous  inferences  for  which  there  is 
information  directly  available  in  the  diagram,  but  which  the  visual  system 
is  not  necessarily  good  at  making.  For  example,  given  a  large  circle  and  a 
smaller  circle,  the  visual  system  can  directly  tell  that  one  is  smaller  than 
the  other,  but  given  two  complicated  shapes,  where  one  of  the  shapes  has  a 
smaller area than the other, the visual architecture cannot compare them
directly  or  easily  without  additional  measurements  and  calculation.)  This 
ability  to  make  some  visual  inferences  directly  explains  the  role  of 
diagrams in problems which are essentially spatial, such as geometry
problems.  Even  here  there  are  interesting  strategies  by  which  the  diagram 
can  be  additionally  manipulated.  Additional  constructions  can  be  made  on 
the  diagram,  enabling  a  new  set  of  visual  inferences  to  be  made  in  the  next 
cycles of reasoning. Also, symbolic annotations can be made on the diagram,
which enable a new round of inferences to be made, not by the visual
architecture but by use of information in the conceptual modality. This
inference  may  set  up  information  which  may  then  enable  additional  visual 
inferences.  This  extremely  interlaced  sequence  of  visual  and  symbolic 
inference-making is what gives this whole approach its power: each
modality makes the inferences it is best suited for, and then sets up
additional  information  which  makes  it  possible  for  the  other  modality  to 
make  another  set  of  inferences  for  which  it  is  best  suited. 


^The AAAI Spring Symposium, Reasoning With Diagrammatic Representations, was held
at Stanford on March 25-27, 1992. The Symposium was organized by B. Chandrasekaran,
Yumi Iwasaki, Hari Narayanan, and Herbert Simon.

In this project we have investigated the role of images and diagrams in
qualitative reasoning about the physical world. Narayanan and
Chandrasekaran (included in the Appendix to this report) illustrate how
much of our commonsense knowledge about how objects in the world
behave under various forces and collisions is actually in the form of
abstract perceptual chunks that direct the reasoning activity. That is,
large  parts  of  both  the  knowledge  and  reasoning  are  in  the  visual  domain. 
Representation  of  object  configuration  diagrams  and  predictive  knowledge 
indexed  by  shapes  as  well  as  visual  events  were  emphasized  as  central 
problems.  This  is  to  be  contrasted  with  much  of  the  current  work  in 
qualitative  physics  which  emphasizes  symbolic  and  axiomatic  reasoning  in 
this  task. 


Reconfigurable  Devices  and  Other  Applications 

Avionics  Design:  Issues,  Methods,  and  Results 

During  the  third  year  of  the  project,  we  tested  our  representations  on  the 
challenges  of  reconfigurable  devices  for  aerospace  domains.  Some  devices 
are  designed  so  that  they  can  be  reconfigured  to  adapt  to  events  and 
changing  conditions.  Reconfigurability  complicates  reasoning  about 
devices,  and  makes  additional  demands  on  device  representations  that 
would support such reasoning. A device may have variable potential
functionality (not all of which will be simultaneously realized). Some
mappings  between  functions  and  the  causal  processes  used  to  attain  them 
are  flexible,  that  is,  device  functions  will  be  achieved  by  different  causal 
processes,  depending  on  configuration  and  circumstances.  We  have  arrived 
at  a  treatment  for  reconfigurable  devices  which  is  at  least  able  to  handle 
the  reconfigurability  present  in  the  range  of  execution  options  available  for 
realizing  specific  aircraft  mission  and  tactical  goals.  We  have  built  a 
prototype  system  that  demonstrates  how  both  quantitative  and  qualitative 
forms  of  simulation  may  be  supported  within  a  unified  model  of  a  device. 
(This  is  described  in  "Representing  function  for  reasoning  about  software- 
hardware  reconfiguration,"  included  in  the  Appendix  to  this  report.)  In 
this system, design-time reconfigurability tradeoffs on capacities needed
for hardware devices can be explored using different methods of
prediction.

Specific  results  on  problem  solving  and  multi-level  causal  modeling  for  the 
aerospace and avionics domain have been reported in the recent 1992
annual report, and will here only be briefly reviewed. Two technical
reports are included in the Appendix to this report: "Representing function
for  reasoning  about  software-hardware  reconfiguration,"  and  “Multilevel 
Causal-Process  Modeling:  Bridging  the  Plan,  Execution,  and  Device- 
Implementation  Gaps.” 

Designing  aerospace  systems  can  potentially  benefit  significantly  from 
multilevel  modeling  that  relates  higher  level  mission  or  tactical  functions 
to  the  implementing  levels  of  supporting  avionics  and  aircraft  system 
subfunctionality. A particular benefit will be in improved levels of fidelity
of  test  and  simulation  for  loads  on  underlying  devices.  Such  improvements 
can  lead  to  architectural  and  hardware  support  for  load  balancing 
strategies  discovered  while  in  the  design  critique  and  verification  process. 

A  model  that  relates  high  level  functions  and  requirements  to  lower  level 
systems  is  needed  for  an  integrated  redesign  system  that  can  track  and 
guide  design  tests  and  modifications.  [The  redesign  task,  of  course, 
includes  many  subordinate  tasks  that  are  spawned  as  the  task  is  pursued; 
simulation,  abduction,  classification  and  diagnosis  can  become  appropriate 
during  redesign.] 

Also,  because  the  computational  demands  of  simulation  can  easily  grow 
beyond  capabilities  that  can  reasonably  be  expected  to  exist  in  the  next 
few  decades,  we  have  begun  to  explore  prototype  environments  for 
investigating  the  feasibility  and  value  of  merging  various  methods  of 
simulation  and  prediction.  At  present  a  prototype  has  shown  how  a 
simulation system that aids hardware design can make use of a mixture of
methods.  The  system  has  investigated  the  capacities  of  that  hardware  in 
relation  to  constraints  arising  from  the  software  functionality  that  the 
hardware  is  to  support. 

We  selected  aerospace  and  avionics  as  a  domain  to  test  recent  advances  in 
causal  process  description  and  functional  decomposition.  In  previous  and 
concurrent  work,  the  domains  of  process  engineering,  simple  manufactured 
devices,  and  biological  processes  have  been  found  to  be  useful  application 
areas  for  LAIR  causal  modeling  techniques.  One  reason  for  varying 
domains  is  to  ensure  that  the  techniques  have  wide  applicability.  A  second 
is  that  it  is  thereby  likely  that  we  will  arrive  at  a  self-contained  and  useful 
set  of  domain-independent  primitive  constructs,  with  well-defined  modes 
of composition, specification, and instantiation. The aerospace and avionics
domain is representative of emergent technology domains in which
information and control systems, and human agents, interact in complex
ways to achieve useful functions. Such systems are difficult to design and
verify, and because of their economic or political importance, deserve to
benefit from advances in knowledge based system technology.

The availability of VHDL libraries of avionics devices has also contributed
to  selecting  avionics  and  aerospace  for  attention.  Trends  in  knowledge 
based  systems  favor  investigation  of  ways  to  ameliorate  the  burden  of 
knowledge  acquisition  by  opportunistically  drawing  upon  the  information 
that  has  been  and  is  being  accumulated  in  some  computationally  accessible 
forms.  It  would  be  clearly  helpful  to  knowledge  based  problem  solving 
technology  if  the  effort  is  made  to  find  more  ways  to  draw  upon 
conventionally  available  data  and  resources.  During  the  final  year  of  the 
project  we  have  investigated  several  points  of  application  of  model-based 
reasoning  technology  in  the  domain  of  the  new  generation  of  avionics 
systems,  especially  the  models  that  involve  explicit  hierarchical 
representation  of  function  and  structure. 

Multilevel  Causal  Process  Models 

The  experience  with  the  Pilot  Associate  (PA)  project  shows  that  it  is 
important  to  consider  how  a  future  knowledge  base  might  support 
knowledge  compilation  for  distributed  PA  modules.  Maintenance  of  a 
technical  knowledge  base  has  been  indirectly  studied  in  the  Learning 
System  for  Pilot  Aiding  (LSPA)  project.  Using  explanation  based 
generalization  techniques,  the  LSPA  project  has  shown  how  new  plans  can 
be  included  in  a  knowledge  base  after  the  plans  are  learned  from  a  record 
of pilot responses in a simulated mission.

Advanced Avionics and the Role of VHDL models: In a future technical
knowledge base, it will be important
to be able to align top level requirements with the lowest level
performance measurements on equipment, including avionics subsystems
at various granularities. VHDL models of avionics subsystems can be used
to  explore  new  capabilities  of  devices  when  the  new  capabilities  affect 
mission  critical  performance  parameters.  This  exploration  can  occur  even 
in  advance  of  the  actual  production  of  the  devices.  But  to  do  so  will 
require  extensive  modeling  of  the  dependencies  and  interdependencies  as 
they  span  both  abstraction  levels  and  subsystems.  We  have  been 
concerned  with  developing  modeling  techniques  expressive  enough  to 
capture  these  dependencies,  and  to  make  them  available  for  a  wide  range 
of  types  of  problem  solving. 


Multilevel  Functional  Decompositions 

During  the  third  year  of  research,  we  developed  functional  decompositions 
and  causal  process  models  for  specific  real  world  problems  that  would 
challenge our modeling techniques. One goal was to develop insight into
what organizational principles would be needed for device modeling in the
aerospace engineering domain that could support design, troubleshooting,
tutorial, and simulation uses. One objective was to develop a
representation (model) that takes a top level function (a mission function
with temporal features to it and a time course during which different
subfunctions are scheduled) and shows a vertical slice of how it gets
decomposed into subfunctions and ultimately into hardware components.

Much  of  our  thinking  about  ordinary  devices  and  how  they  work  involves 
viewing  those  objects  as  contributing  to  various  causal  processes  that  make 
the device realize its functions. Causation has been called “the cement of
the  universe,”  and  few  would  contest  the  centrality  of  causal  relations,  and 
the  role  they  play  in  organizing  our  information  about  systems,  events, 
components,  processes,  states,  subsystems  and  functions.  One  organization 
of  our  causal  knowledge  of  a  system,  especially  a  complex  system  such  as  a 
pilot  flying  an  aircraft,  is  from  overall  functionality  down  to  the 
subsystems  with  their  subcomponents  and  their  contributing 
functionalities.  A  functional  decomposition  of  aircraft  flights  is  primarily 
of  value  to  show  how  to  move  from  higher  levels  of  functional  description 
to  lower  levels  of  causal  processes  that  realize  functions  of  interest.  A 
decomposition  in  which  the  functional  parts  are  both  represented  and 
related  then  provides  the  "integrative  glue"  for  subsequent  problem 
solving  across  the  levels.  If  we  want  to  know  how  an  equipment  change, 
for example, might affect parameters governing a maneuver for
accomplishing  some  flight  goal,  the  dependencies  marked  in  a  functional 
decomposition  can  show  us  what  to  look  for,  and  how  to  figure  out  what 
has  changed.  Results  of  using  FR  for  a  multi-level  functional 
representation  connecting  maneuver-level  phenomena  with  hardware- 
level  phenomena  are  reported  in  a  paper  that  is  included  in  the  Appendix 
of  this  report. 

Use  of  Functional  Models  for  Controlling  Simulation  for  Hardware 
Estimation 

We  have  made  considerable  progress  in  developing  tools  and  techniques 
for  carrying  out  "smarter"  simulations  of  software  and  hardware  avionics 
subsystems. Computer simulation of devices usually consumes a significant
amount  of  computational  time.  Making  simulation  more  efficient  and  goal- 
oriented  will  help  to  overcome  this  problem.  One  objective  of  the  work 
reported  here  is  to  show  how  simulations  may  be  made  more  efficient  by 
blending functional representation (FR) with structure-based models.

Our system, aimed at intelligent control of simulation, consists of two main
layers: 

1.  The  upper  layer  generates  a  set  of  higher  simulation  goals  based  on  the 
high  level  tasks  of  the  problem-solver.  Some  subtasks  of  diagnosis,  design 
support,  possibly  combined  with  the  representation  of  the  role  of  the 
device  within  a  more  complex  device,  can  provide  high  level  goals  for 
simulation. 

2.  The  lower  layer  economically  achieves  the  basic  simulation  goals 
generated  by  the  upper  layer.  The  savings  on  simulation  time  are  gained 
by  focusing  on  simulation  goals  while  making  use  of  the  capabilities  of  FR 
to  focus  on  the  relevant  causal  pathways  to  project. 
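
The division of labor between the two layers can be sketched roughly as follows. Here a dependency map stands in for the causal pathways that FR would supply, and a small table of update functions stands in for the structure-based model; the variable names, numbers, and update rules are invented for illustration and are not taken from the prototype systems.

```python
def relevant_variables(depends_on, goal):
    """Upper-layer goal -> the goal variable plus everything it causally depends on."""
    needed, stack = set(), [goal]
    while stack:
        var = stack.pop()
        if var not in needed:
            needed.add(var)
            stack.extend(depends_on.get(var, ()))
    return needed


def simulate(model, inputs, only=None):
    """Evaluate the structural model in order; `only` restricts which variables run.

    `model` is a list of (variable, input names, update function) in dependency
    order. Returns the computed values and the number of update functions invoked.
    """
    values, calls = dict(inputs), 0
    for var, arg_names, fn in model:
        if only is not None and var not in only:
            continue  # off the relevant causal pathway: skip the costly update
        values[var] = fn(*(values[a] for a in arg_names))
        calls += 1
    return values, calls


# Hypothetical avionics-flavored model; all names and formulas are assumptions.
model = [
    ("bus_load", ["msg_rate"], lambda r: r * 0.5),
    ("cpu_load", ["bus_load"], lambda b: b + 10),
    ("display_refresh", ["frame_rate"], lambda f: f * 2),
    ("cooling_demand", ["display_refresh"], lambda d: d / 4),
]
inputs = {"msg_rate": 40, "frame_rate": 30}
depends_on = {var: args for var, args, _ in model}

needed = relevant_variables(depends_on, "cpu_load")   # goal set by the upper layer
focused, focused_calls = simulate(model, inputs, only=needed)
full, full_calls = simulate(model, inputs)            # brute-force baseline
print(focused["cpu_load"], focused_calls, full_calls)
```

In this toy case the focused run invokes half as many update functions as the brute-force run while producing the same value for the goal variable; real savings would depend on how much of the device lies off the relevant pathways.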

The  objectives  of  our  investigations  of  task  based  control  of  simulation 
have  been: 

•  to  show  how  FR  can  be  smoothly  combined  with  structure-based 
models  for  simulation  in  a  common  framework. 

•  to  determine  the  advantages  of  combining  the  two,  and  whether  FR  can 
help  control  the  simulation. 

•  to  determine  whether  higher  level  goals  of  the  problem-solver  can  be 
used  to  create  lower  level  goals  for  the  simulator. 

•  to  investigate  the  tradeoffs  between  controlling  simulation  by  FR  and 
simulating  brute-force  by  using  the  structural  model  alone. 

Two  prototype  systems  were  developed  to  accomplish  these  objectives. 

The  initial  results  have  shown  in  principle  that  simulation  of  large  systems 
can  be  done  efficiently  by  meshing  less  computationally  costly  techniques 
with  focused  invocation  of  relevant  computationally  demanding  methods. 
Details  of  the  preliminary  results  are  found  in  a  paper  included  in  the 
appendix to this report.


THE TASK-STRUCTURE METHODOLOGY

Design  problem  solving  is  a  complex  activity  involving  a  number  of 
subtasks,  and  a  number  of  alternative  methods  potentially  available  for 
each  subtask.  The  structure  of  tasks  has  been  a  key  concern  of  recent 
research  in  task-oriented  methodologies  for  knowledge-based  systems 
(Clancey, 1985; Chandrasekaran, 1986; McDermott, 1988; Steels, 1990).
One way to conduct a task analysis is to develop a task-structure
(Chandrasekaran, 1989) that lays out the relation between a task,
applicable methods for it, the knowledge requirements for the methods,
and the subtasks set up by them. The major goal of this chapter is to
develop a task structure for design as a knowledge-based problem-solving
activity. 


Design  as  search  in  a  space  of  subassemblies 

Designing  artefacts  that  are  meant  to  achieve  some  functions  within  some 
constraints  is  an  important  class  of  design  with  characteristic  properties 
(Goel and Pirolli, 1989). We concentrate on this class of design problems
in  this  chapter. 

For sufficiently complex versions of the design problem, a common
theme emerges for design as a process; it involves mappings from the
space of design specifications to the space of devices or components
(often referred to as mapping from behaviour to structure), typically
conducted by means of a search or exploration in the space of possible
subassemblies of components. This is in fact the origin of the frequent
suggestion that design is a synthetic task.

^ This work has evolved over a number of years. Earlier versions have appeared as Chapter
2 of Brown and Chandrasekaran (1989) and in Research in Engineering Design (1989), 1,
75-Wi. This version previously appeared in AI Magazine (1990), 11, 59-71.

The  design  problem  is  formally  a  search  problem  in  a  very  large  space 
for  objects  that  satisfy  multiple  constraints.  Only  a  vanishingly  small 
number  of  objects  in  this  space  constitute  even  “satisficing”,  not  to  speak 
of  optimal,  solutions.  What  is  needed  to  make  design  practical  are 
strategies  that  radically  shrink  the  search  space. 

Set  against  the  view  of  design  as  a  deliberative  problem-solving  process 
is the view of design as an “intuitive”, almost instantaneous, process in
which  a  design  solution  comes  to  the  mind  of  the  designer.  Artistic 
creations  and  scientific  theories  are  often  said  by  their  creators  to  have 
occurred  to  them  in  this  manner.  Even  when  a  plausible  solution  occurs 
in  this  way,  the  proposal  still  needs  to  be  evaluated,  critiqued  and 
modified  by  deliberatively  examining  alternatives.  That  is,  except  in 
simple  cases,  deliberative  processes  are  still  essential  for  real-world 
design. 

Functions,  constraints,  components  and  relations 

A  designer  is  charged  with  specifying  an  artefact  that  delivers  some 
functions  and  satisfies  some  constraints.  For  each  design  task,  the 
availability  of  a  (possibly  large  and  generally  only  implicitly  specified)  set 
of  primitive  components  can  be  assumed.  The  domain  also  specifies  a 
repertoire  of  primitive  relations  or  connections  that  are  possible  between 
components.  An  electronics  engineer,  for  example,  may  assume  the 
availability  of  transistors,  capacitors,  and  other  electrical  components 
when  he  is  designing  a  waveform  generator.  Primitive  relations  in  that 
domain  are  serial  and  parallel  connections  between  components. 

Of  course,  design  is  in  general  recursive:  if  a  certain  component  that 
was  assumed  to  be  available  is  in  fact  not  available,  the  design  of  that  can 
be  undertaken  in  the  next  round.  However,  the  vocabulary  of  primitive 
components  and  relations  may  be  rather  different  from  those  for  the 
original  device. 

Functions  can  be  expressed  as  a  state  or  a  series  of  states  that  we  want 
the  device  to  achieve  or  avoid  under  specified  conditions.  Functions  may 
be  explicitly  stated  as  part  of  problem  specifications,  or  they  may  be 
implicit  in  the  designer's  understanding  of  the  domain.  An  example  of  an 
implicit  function  in  many  engineering  devices  is  safety:  e.g.  a  subsystem’s 
role  may  only  be  explained  as  something  that  prevents  leakage  of  a 
potentially  hazardous  substance,  and  this  function  may  never  be  stated 
explicitly as part of the design specification (Keuneke, 1989).

Usually, design specifications will mention, in addition to desired
functionalities,  a  number  of  constraints.^  The  distinction  between 
functions  and  constraints  is  hard  to  pin  down  formally;  functions  are 
constraints  on  the  behaviour  or  properties  of  the  device.  It  is,  however, 
useful  to  distinguish  functions  from  other  constraints,  since  the  former  are 
the  primary  reason  why  the  device  is  desired.  Design  constraints  can  be 
on the properties of the artefact (e.g. “Should not weigh more than ...”),
on  the  process  of  making  the  artefact  from  its  description 
(manufacturability  constraints),  on  the  design  process  itself  (e.g.  “I  want 
a  design  within  a  week”),  and  so  on.  A  computationally  effective  process 
of  design  is  to  generate  a  candidate  design  based  on  functions  and  then  to 
modify  it  to  meet  the  constraints. 

Definition  of  the  design  task 

Consider  the  following  definition  of  the  design  task.  The  design  problem 
is specified by:

•  a  set  of  functions  (explicitly  stated  by  the  design  consumer  as  well 
as  implicit  ones  defined  by  the  domain)  to  be  delivered  by  an 
artefact  and  a  set  of  constraints  to  be  satisfied;  and 

•  a  “technology”,  i.e.  a  repertoire  of  components  assumed  to  be 
available  and  a  vocabulary  of  relations  between  components. 

The  constraints  may  pertain  to  the  design  parameters  themselves,  to  the 
process  of  making  the  artefact,  or  to  the  design  process.  The  solution  to 
the  design  problem  consists  of  a  complete  specification  of  a  set  of 
components  and  their  relations,  which  together  describe  an  artefact  that 
delivers  the  functions  and  satisfies  the  constraints.  The  solution  is 
expected  to  satisfy  a  set  of  implicit  criteria  as  well,  e.g.  it  is  not  much 
more  complex  or  costly  than  plausible  alternatives  (ruling  out  Rube 
Goldberg  devices). 
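
To make the definition concrete, the following minimal sketch encodes a design problem as functions, constraints, and a technology, and checks whether a candidate artefact is a solution. The class names, the predicate encoding of functions and constraints, and the small circuit example are assumptions made for illustration; the implicit criteria (such as ruling out needlessly complex designs) are not modeled.

```python
from dataclasses import dataclass


@dataclass
class DesignProblem:
    functions: dict    # name -> predicate over a candidate artefact
    constraints: dict  # name -> predicate over a candidate artefact
    components: set    # the "technology": available component types
    relations: set     # the "technology": allowed relation types


@dataclass
class Artefact:
    parts: dict        # part name -> component type
    links: list        # (part name, relation type, part name)


def unmet(problem, artefact):
    """Return what the candidate fails; an empty list means it is a solution."""
    failures = []
    for name, kind in artefact.parts.items():
        if kind not in problem.components:
            failures.append(f"unavailable component: {name}")
    for _, rel, _ in artefact.links:
        if rel not in problem.relations:
            failures.append(f"unknown relation: {rel}")
    for name, holds in {**problem.functions, **problem.constraints}.items():
        if not holds(artefact):
            failures.append(name)
    return failures


# Hypothetical problem: a tiny indicator-light circuit.
problem = DesignProblem(
    functions={"lights up": lambda a: {"battery", "led"} <= set(a.parts.values())},
    constraints={"at most three parts": lambda a: len(a.parts) <= 3},
    components={"battery", "resistor", "led"},
    relations={"series", "parallel"},
)
good = Artefact(
    parts={"B1": "battery", "R1": "resistor", "D1": "led"},
    links=[("B1", "series", "R1"), ("R1", "series", "D1")],
)
bad = Artefact(parts={"B1": "battery", "M1": "motor"}, links=[("B1", "glued", "M1")])

print(unmet(problem, good))
print(unmet(problem, bad))
```

Treating functions and constraints uniformly as predicates reflects the remark that functions are themselves constraints on behaviour; keeping them in separate fields preserves the distinction the text finds useful.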

The  preceding  definition  also  captures  the  domain-independent  character 
of  design  as  a  generic  activity.  Planning,  programming  and  engineering 
design  all  share  the  above  definition,  as  well  as  many  of  the  subprocesses, 
to  a  significant  degree.  Nevertheless,  there  are  versions  of  the  design 
problem  for  which  the  above  definition  needs  to  be  modified  or  extended. 
Examples  are: 

• At the start of the design process only a minimal statement of
functions and constraints may be available, and additional ones
may be developed in parallel with the design process itself.

• Some design problems involve extensive trade-off studies, where
a part of the design process is search for ways in which the
functions or the constraints may be relaxed or otherwise
modified.

• "Tinkering" is a time-honoured method of invention where the
design space is being explored without any specific set of
functions in mind. Functions may be identified for structural
configurations that arise during exploration.

• The world of primitive objects may be very open-ended, and only
implicitly specified.

^ The constraints that are described as part of the design specification ought to be
distinguished from the term "constraint" that appears in description of design methods, such
as "constraint-directed problem solving".

The design framework that I will be presenting can be extended to cover
these  variations. 


THE  TASK-STRUCTURE 

Let  us  say  we  have  a  problem-solving  task^  T,  and  let  M  be  some  method 
suggested  for  the  task.  A  method  can  be  described  in  terms  of  the 
operators  it  employs,  the  objects  that  it  operates  on,  and  any  additional 
knowledge  about  how  to  organize  operator  application  to  satisfy  the  goal. 
At  the  knowledge  level,  the  method  is  characterized  by  the  knowledge 
the  agent  needs  to  set  up  and  apply  the  method.  Different  methods  for 
the  same  task  may  call  for  knowledge  of  different  types. 

To  take  a  simple  example,  for  the  task  of  multiplying  two  multidigit 
numbers,  the  “logarithmic  method”  consists  of  the  following  series  of 
operations:  extract  the  logarithm  of  each  of  the  input  numbers,  add  the 
two  logarithms,  and  extract  the  anti-logarithm  of  the  sum.  (The  operators 
are  italicized.  Their  arguments  as  well  as  the  results  are  the  objects  of  this 
method.) 
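
The logarithmic method can be written out as a short sketch in which each operator is a separate step. Here the operators are taken as primitives supplied by Python's standard math module, and the method is assumed to apply only to positive numbers; the function name is hypothetical.

```python
import math


def multiply_by_logarithms(x, y):
    """Multiply two positive numbers using the operators named in the text."""
    log_x = math.log10(x)   # operator: extract the logarithm
    log_y = math.log10(y)   # operator: extract the logarithm
    total = log_x + log_y   # operator: add
    return 10 ** total      # operator: extract the anti-logarithm


product = multiply_by_logarithms(123, 456)
print(product)  # close to 123 * 456, up to floating-point rounding
```

If any of these operators were not available as a primitive, computing it would become a subtask with its own candidate methods, which is the recursive structure a task analysis lays out.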

Note  that  one  does  not  typically  include,  at  this  level  of  description  of 
the  logarithmic  method,  specifications  about  how  to  extract  the  logarithm 
or  the  anti-logarithm,  or  how  to  do  the  addition.  If  the  computational 
model  does  not  provide  these  capabilities  as  primitives,  performing  these 
operations  can  be  set  up  as  subtasks  of  the  method.  Thus,  given  a 
method,  applying  any  of  the  operators  of  a  method  can  be  set  up  as  a 
subtask.  Some  of  the  objects  a  method  needs  may  be  generic  to  a  class  of 
problems  in  a  domain.  As  an  example,  consider  hierarchical  classification 
using a malfunction hierarchy, a common method of diagnosis. Operations
of “Establish-Hypothesis” and “Refine-Hypothesis” are applied to
the hypotheses in the hierarchy. These objects are useful to solve many
instances of the diagnostic problem in the domain. If the malfunction
hypotheses  are  not  directly  available,  generation  of  such  hypotheses  can 
be  set  up  as  subtasks.  A  common  method  for  the  generation  of  such 
objects  is  compilation  from  so-called  “deep”  knowledge.  Structure- 
function  models  of  the  device  that  is  being  diagnosed  have  been  proposed 
and  used  as  deep  models  to  generate  malfunction  hypotheses 
(Chandrasekaran  et  al.,  1989). 
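
A minimal sketch of hierarchical classification with the two operations might look as follows. The malfunction hierarchy, the findings, and the rule that a hypothesis is established by a single required finding are all invented for illustration; in practice establishing a hypothesis would involve richer matching knowledge.

```python
def classify(node, findings, hierarchy, establish):
    """Establish-refine: return the most specific hypotheses that are established."""
    if not establish(node, findings):   # Establish-Hypothesis
        return []                       # rejected: prune the whole subtree
    confirmed = []
    for child in hierarchy.get(node, []):   # Refine-Hypothesis
        confirmed.extend(classify(child, findings, hierarchy, establish))
    return confirmed or [node]


# Hypothetical malfunction hierarchy and evidence table.
hierarchy = {
    "engine_fault": ["fuel_system_fault", "ignition_fault"],
    "fuel_system_fault": ["pump_failure", "clogged_filter"],
}
required_finding = {
    "engine_fault": "engine_stalls",
    "fuel_system_fault": "low_fuel_pressure",
    "ignition_fault": "no_spark",
    "pump_failure": "pump_silent",
    "clogged_filter": "filter_dirty",
}


def establish(hypothesis, findings):
    return required_finding[hypothesis] in findings


findings = {"engine_stalls", "low_fuel_pressure", "filter_dirty"}
print(classify("engine_fault", findings, hierarchy, establish))
```

The hierarchy and evidence table are the generic objects of the method: they are reused across many diagnostic cases, and only the findings change from case to case.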

There  is  no  finite  set  of  mutually  distinct  methods  for  a  task,  since  there 
can  be  numerous  variants  on  a  method.  Nevertheless,  the  term  “method” 
is  a  useful  shorthand  to  refer  to  a  set  of  related  proposals  about 
organizing  computation. 

Types  of  methods.  One  type  of  method  is  of  particular  importance  in 
knowledge-based  systems:  methods  which  can  be  viewed  as  a  problem- 
space search (Newell, 1980). Designer-Soar (Steier, 1989) and AIR-CYL
(Brown  and  Chandrasekaran,  1989)  are  examples  of  design  systems  which 
explore  search  spaces.  For  example,  AIR-CYL  can  be  understood  as 
searching  in  a  space  of  parameters  for  the  components  of  an  air-cylinder 
by  using  design  plans  which  propose  and  modify  parameter  values. 

Another  class  of  methods  consists  of  algorithms  which  directly  produce 
a  solution  without  any  search  in  a  space  of  alternatives,  e.g.  producing  a 
set  of  design  parameters  by  numerically  solving  a  set  of  simultaneous 
equations.  Such  algorithms  are  only  available  for  so-called  well-structured 
problems.* Most real-world problems are ill-structured, and the role of
domain  knowledge  is  to  help  set  up  spaces  of  alternatives  and  to  help 
control  the  search  in  those  spaces. 

A  task  analysis  of  this  type  can  be  continued  recursively  until  methods 
whose operators are all directly achievable (within the analysis framework)
are reached. In the following task analysis for design, I will
explicitly  indicate  as  subtasks  only  those  to  which  I  want  to  draw  specific 
attention  in  my  discussion.  Other  operators  may  exist  which  require 
additional  problem  solving  as  well. 


A  TASK-STRUCTURE  FOR  DESIGN:  THE 
PROPOSE-CRITIQUE-MODIFY  FAMILY  OF  METHODS 

The  most  common  top-level  family  of  methods  for  design  can  be 
characterized  as  Propose-Critique-Modify  (PCM)  methods.  These 


*  I  subscribe  to  the  view  that  such  algorithms  are  simply  degenerate  cases  of  search  where 
the agent has sufficient knowledge to make the correct choice at each choice point. But,
pragmatically  speaking,  it  is  best  to  think  of  algorithmic  methods  as  a  separate  type,  since 
implementing  them  does  not  require  supporting  search  in  general. 
methods have the subtasks of Proposal of partial or complete design
solutions;  Verification  of  proposed  solutions;  Critiquing  the  proposals  by 
identifying  causes  of  failure,  if  any;  and  Modification  of  proposals  to 
satisfy  design  goals.  These  subtasks  can  be  combined  in  fairly  complex 
ways,  but  the  following  is  one  straightforward  way  in  which  a  PCM 
method  may  organize  and  combine  the  subtasks. 

Example  PCM  method 

Step 1. Given design goal, Propose solution. If no proposal, exit with
failure. 

Step  2.  Verify  proposal.  If  verified,  exit  with  success. 

Step 3. If unsuccessful, Critique proposal to identify sources of failure. If
no  useful  criticism  available,  exit  with  failure. 

Step  4.  Modify  proposal  and  return  to  2. 
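
The four steps above can be sketched as a loop. This is only one straightforward arrangement under stated assumptions: the four subtasks are passed in as functions, `None` signals "no proposal" or "no useful criticism", and the iteration bound `max_iterations` is an added safeguard that is not part of the example method. The beam-thickness problem is a made-up toy.

```python
def pcm_design(goal, propose, verify, critique, modify, max_iterations=100):
    """One straightforward Propose-Critique-Modify loop.

    Returns a verified proposal, or None on failure.
    """
    proposal = propose(goal)                      # Step 1
    if proposal is None:
        return None                               # no proposal: failure
    for _ in range(max_iterations):
        if verify(goal, proposal):                # Step 2
            return proposal
        criticism = critique(goal, proposal)      # Step 3
        if criticism is None:
            return None                           # no useful criticism
        proposal = modify(proposal, criticism)    # Step 4, back to Step 2
    return None  # iteration bound (an added assumption) exhausted


# Toy usage (hypothetical domain): pick a beam thickness for a load.
def propose(goal):
    return 1

def verify(goal, thickness):
    return thickness * 10 >= goal        # hypothetical capacity model

def critique(goal, thickness):
    return "too thin" if thickness < 50 else None

def modify(thickness, criticism):
    return thickness + 1

result = pcm_design(35, propose, verify, critique, modify)
```

In this toy case the loop exits with success once the thickness reaches 4, and exits with failure for a goal of 1000 because the critic has nothing useful to say once the thickness reaches 50.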

While  all  of  the  PCM  methods  will  need  to  have  some  way  to  achieve 
the  iteration  in  Step  4  above,  there  can  be  numerous  variants  on  the  way 
the  methods  in  this  class  work.  For  example,  a  solution  may  be  proposed 
only  for  a  part  of  the  design  problem,  a  part  deemed  to  be  crucial.  This 
solution  may  then  be  critiqued  and  modified.  This  partial  solution  may 
generate  additional  constraints,  leading  to  further  design  commitments. 
Thus  subtasks  can  be  scheduled  in  a  fairly  complex  way,  with  subgoals 
from  different  methods  alternating.  It  is  hard  to  identify  a  separate 
method for each such variation. The implications of this for a design
architecture  are  discussed  in  the  concluding  sections  of  the  chapter. 

In  this  chapter,  most  of  the  attention  is  devoted  to  the  Proposal  subtask, 
since  most  of  the  design  knowledge,  per  se,  is  used  in  this  subtask.  Every 
task  has  a  default  method:  one  which  uses  compiled  knowledge  to  get  a 
solution  without  any  problem  solving.  This  method  is  practical  only  in 
simple  cases.  Because  this  method  is  potentially  applicable  to  simple 
versions  of  all  tasks,  and  has  no  interesting  internal  structure,  I  will  not 
explicitly  mention  it  in  my  discussion. 

A  task  analysis  should  provide  a  framework  within  which  various 
approaches  to  design  can  be  understood.  I  will  use  selected  examples  of 
existing  AI  systems  to  illustrate  the  ideas,  but  there  will  be  no  attempt  to 
provide  a  survey  of  all  AI  work  on  design. 


Methods  for  proposing  design  choices 

Design  proposal  methods  use  domain  knowledge  to  map  part  or  all  of  the 
specifications  to  partial  or  complete  design  proposals.  Three  groups  of 
methods  can  be  identified: 

•  Problem  decomposition/solution  composition.  In  this  class  of 
methods,  domain  knowledge  is  used  to  map  subsets  of  design 
specifications into a set of smaller design problems. Use of design
plans  is  a  special  case  of  decomposition  methods. 

•  Retrieval  of  cases  from  memory  which  correspond  to  solutions 
for  design  problems  which  are  similar  or  “close”  to  the  current 
problem. 

•  Families  of  methods  that  solve  the  design  problem  as  a  constraint 
satisfaction  problem  and  use  a  variety  of  quantitative  and 
qualitative  optimization  or  constraint  satisfaction  techniques. 

Decomposition  and  case-based  methods  help  reduce  the  size  of  the 
search  spaces,  since  the  knowledge  they  use  can  be  viewed  as  the 
compilation  or  chunking  of  earlier  (individual  or  community)  search  in 
the  design  space.  Conversion  of  a  design  problem  into  one  amenable  to 
global  optimization  algorithms  requires  substantial  a  priori  knowledge  of 
the  structure  of  the  design  problem. 


Decomposition/solution composition

I  will  treat  this  method  in  terms  of  all  the  features  that  an  information 
processing  analysis  calls  for:  types  of  knowledge  and  information  needed 
and  the  inference  processes  that  operate  on  this  form  of  knowledge. 

Knowledge needed is of the form D → D1, D2, ..., Dn, where D is a
given design problem, and the Di are “smaller” subproblems (i.e. associated
with smaller search spaces than D). A number of alternative decompositions
for a problem may be available, in which case a selection needs to
be made, with the attendant possibility of backtracking and making
another choice. Repeated applications of the decomposition knowledge
produce design hierarchies. In well-trodden domains, effective decompositions
are known and little search for decompositions needs to be
conducted  as  part  of  routine  design  activity.  For  example,  in  automobile 
design,  the  overall  decomposition  has  remained  largely  invariant  over 
several  decades. 

Decomposition knowledge in design generally arises when the functional
specifications can be decomposed into a set of subfunctions
(Freeman and Newell, 1971). Design decomposition knowledge may
come  in  the  form  of  part-subpart  decomposition,  if  a  direct  mapping  is 
available  between  functions  and  components. 

The  following  are  two  important  subgoals  of  the  Decomposition/ 
Solution  Composition  method. 

• Generating specifications for subproblems. The functional and
other specifications on D need to be translated into specifications
for each of the subproblems D1, ..., Dn.

•  Gluing  the  subproblem  solutions  into  a  solution  to  the  original 
design problem.

In most routine design, these subtasks are not explicit: either they are
solved  by  compiled  knowledge  or  the  problem  specification  already 
implies  a  solution  to  these  problems.  In  the  general  case,  however, 
additional  problem  solving  is  needed. 

How  a  Decomposition/Solution  Composition  method  might  actually 
organize  and  use  the  subgoals  is  given  by  the  following  example. 

Example problem decomposition/solution recomposition method
Step  1.  (Search  in  the  space  of  decompositions.)  Choose  from  among 
alternative  decompositions  for  the  given  design  problem  D. 

Step 2. Generate specifications for subproblems in the chosen
decomposition.

Step  3.  Set  up  each  subproblem  as  a  design  problem.  Solve  them  in  some 
order  determined  by  control  strategies  and  other  domain 
knowledge  (e.g.  progressive  deepening). 

Step  4.  If  subproblems  solved,  Recompose  solutions  of  subproblems  into 
solutions  for  D,  and  exit. 

Step 5. If failure in Steps 3 or 4, go to Step 1 to make another choice, or
relax  specifications  and  go  to  Step  2. 
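
A simplified sketch of these steps is given below. It assumes that decomposition knowledge is a table of alternatives, that subproblem names stand in for their generated specifications (so Step 2 is folded into the table), and that failure leads only to trying another decomposition; relaxing specifications is omitted. All names (`design`, `decompositions`, `catalogue`) are hypothetical.

```python
def design(problem, decompositions, solve_primitive, recompose):
    """Solve `problem` by decomposition, backtracking over alternatives.

    decompositions: dict mapping a problem to a list of alternative
                    decompositions, each a list of subproblems.
    solve_primitive: returns a solution for a problem with no known
                     decomposition, or None on failure.
    recompose: combines subproblem solutions into a solution for `problem`.
    """
    alternatives = decompositions.get(problem)
    if not alternatives:
        return solve_primitive(problem)
    for subproblems in alternatives:                   # Step 1
        solutions = []
        for sub in subproblems:                        # Steps 2-3
            solution = design(sub, decompositions, solve_primitive, recompose)
            if solution is None:
                break                                  # Step 5: next choice
            solutions.append(solution)
        else:
            return recompose(problem, solutions)       # Step 4
    return None


# Toy usage: the first decomposition fails, the second succeeds.
decompositions = {
    "vehicle": [["jet engine", "chassis"], ["diesel engine", "chassis"]],
}
catalogue = {"diesel engine": "D-100", "chassis": "C-7"}
plan = design(
    "vehicle",
    decompositions,
    solve_primitive=catalogue.get,
    recompose=lambda problem, parts: {problem: parts},
)
```

The subproblems here are solved in the order listed; a control strategy such as progressive deepening would replace that fixed order.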

All  the  caveats  mentioned  in  connection  with  the  PCM  method  earlier 
apply.  Specifically,  control  of  how  subproblems  are  solved  may  be  quite 
variable  and  more  complex  than  indicated  above.  Some  of  the  sources  of 
this  complexity  are  discussed  below. 

Given  a  design  problem,  it  may  not  always  be  possible  to  generate  all 
the constraints for its subproblems from the original problem’s
specifications alone. In many domains, constraint generation for some subproblems
alternates  with  partial  design  of  others,  which  in  turn  provides  additional 
information  for  constraints  for  yet  other  subproblems.  There  may  be  a 
complex  process  of  commitments  and  backtracking.  In  extreme  cases, 
most  of  design  problem  solving  may  consist  of  search  for  parameters  that 
make  all  the  subproblems  solvable.  For  example,  the  Propose  and  Revise 
method  (Marcus  et  al.,  1985)  involves  making  commitments  to  some 
subparts  of  the  design  problem  (Propose  part)  and  then  Revising  these 
when some constraints for other parts of the problem are violated.

In configuration tasks (Mittal and Frayman, 1989), subproblem
solutions  are  given  as  part  of  the  problem  (i.e.  the  desired  functions  are 
mapped  into  a  set  of  key  components),  and  the  remaining  task  is 
dominated by the subtasks of specification generation and solution
recomposition. In order for components A and B to be connected, certain pre- and
post-conditions  may  need  to  be  satisfied.  If  these  conditions  are  not 
available  a  priori,  they  need  to  be  derived  from  configuration  behaviours. 
Discovery  of  connection  conditions  and  checking  of  whether  specific 
configuration  proposals  result  in  desired  functional  behaviours  can  often  use 
simulation as a problem solving method (e.g. Kelly and Steinberg, 1982).

There can be complex dependencies between constraints among
subproblems. In situations where not only are commitments for D1 going
to constrain the specifications for D2, ..., Dn, but the commitments for
the latter may further specify constraints for D1 as well, a strategy that
Steier (1989) has identified as progressive deepening is a natural strategy
to emerge. This strategy involves making some commitment for each
subproblem at each pass, using these commitments to generate additional
specifications, undoing earlier commitments as needed, and repeating this
process.

Control  issues.  There  are  two  sets  of  control  issues,  one  dealing  with 
which  sets  of  decompositions  to  choose  (in  Step  1  in  the  example 
Decomposition/Recomposition  method  above),  and  the  other  concerned 
with the order in which the subproblems within a given decomposition
ought  to  be  attacked  (Step  3).  For  the  first  problem,  in  the  general  case, 
the  decomposition  will  produce  an  AND  or  an  OR  node.  The 
decompositions  in  an  AND  node  will  all  need  to  be  solved,  while  for  an 
OR node only one of the decompositions will need to be solved. Finding
the  appropriate  decomposition  requires  search  in  an  AND/OR  graph.  But 
as  a  rule  such  searches  are  expensive.  In  domains  where  multiple 
decompositions  are  possible  but  there  are  no  easily  formalizable  heuristics 
to  choose  among  them,  the  machine  may  be  effective  in  proposing 
alternatives  while  the  human  evaluates  them  and  makes  a  selection. 
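
The AND/OR distinction can be illustrated with a small solvability check. This sketch assumes an acyclic graph stored as a dictionary and a set of directly solvable leaf problems; the representation and the names `solvable`, `graph` and `primitives` are illustrative choices. It performs exhaustive search with no heuristics, which is the expensive case the text warns about.

```python
def solvable(node, graph, primitives):
    """Return True if `node` can be solved in an AND/OR graph.

    graph maps a node to ("AND", children) or ("OR", children);
    nodes in `primitives` are directly solvable leaves.
    Assumes the graph is acyclic.
    """
    if node in primitives:
        return True
    if node not in graph:
        return False
    kind, children = graph[node]
    results = (solvable(child, graph, primitives) for child in children)
    # AND: every child must be solved; OR: one solved child is enough.
    return all(results) if kind == "AND" else any(results)


# Toy usage: D can be decomposed by plan A or plan B (an OR node);
# each plan needs all of its parts (AND nodes).
graph = {
    "D": ("OR", ["plan A", "plan B"]),
    "plan A": ("AND", ["a1", "a2"]),
    "plan B": ("AND", ["b1"]),
}
primitives = {"a1", "b1"}
d_ok = solvable("D", graph, primitives)
plan_a_ok = solvable("plan A", graph, primitives)
```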

In  routine  design,  extensive  searches  in  the  spaces  of  possible 
decompositions are avoided by limiting the number of possible
decompositions at each choice point to one or very few. This leads to the
availability  of  a  design  hierarchy  for  design  in  that  domain. 

Transformation  methods  (Balzer,  1981)  for  algorithm  synthesis  are 
examples  of  decomposition  methods.  In  this  approach,  a  set  of  high-level 
specifications  of  an  algorithm  is  converted  into  a  series  of  programming- 
language-level  commitments.  This  is  done  by  mapping  subsets  of 
specifications  into  a  “component”  for  which  some  implementation-level 
commitments  have  been  made.  Each  such  commitment  will  typically 
constrain  other  implementation  commitments.  Because  of  this,  search  in 
the  space  of  possible  transformations  may  often  be  needed.  In  most 
implemented  transformation  systems,  humans  choose  from  a  set  of 
alternative  transformations  presented  by  the  design  system. 

Regarding  the  order  in  which  subproblems  in  a  given  decomposition 
are to be attacked, the main constraint is knowledge about dependencies
between subproblems that I just discussed. When the subproblems are
organized in the form of a design hierarchy, the default control is
top-down,  but  actual  control  can  be  complicated.  For  example,  a 
component  at  the  leaf  level  of  the  design  hierarchy  may  be  the  most 
limiting  component  and  many  other  components  and  subsystems  can  only 
be designed after that is chosen. Part of the design process in this case
will  appear  to  have  a  bottom-up  flavour.  In  general,  appropriate  control 
strategies  come  about  based  on  the  dependencies  between  subproblems. 

Design  plans.  A  special  case  of  decomposition  knowledge  is  design 
plans,  representing  a  precompiled  partial  solution  to  a  design  goal 
(Friedland, 1979; Rich, 1981; Johnson and Soloway, 1985; Mittal et al.,
1986;  Brown  and  Chandrasekaran,  1989).  A  design  plan  specifies  a 
sequence  of  design  actions  to  take  for  producing  a  design  or  part  of  a 
design.  Design  commitments  made  by  a  design  plan  may  be  abstract,  i.e. 
choices  are  made  not  at  the  level  of  primitive  objects  but  rather 
intermediate  level  design  abstractions  which  need  to  be  further  refined  at 
the  level  of  primitive  objects.  For  example,  in  designing  an  automobile,  a 
design  plan  may  commit  to  choice  of  diesel  engine  as  the  power  plant. 
While  this  is  a  design  proposal  in  the  sense  that  a  commitment  is  being 
made,  the  diesel  engine  design  itself  is  not  specified  in  detail  at  this  stage, 
but  posed  as  a  subtask  to  be  solved  by  any  of  the  available  methods. 

Thus a design plan D may set up other design problems D1, ..., Dn as
subproblems,  and,  in  this  sense,  it  is  decomposition  knowledge  in  a  strong 
form:  how  the  main  problem  goals  are  transformed  into  goals  to  be 
allocated  to  subproblems  and  how  the  solutions  to  the  subproblems  are 
put  back  together  for  obtaining  a  solution  to  the  original  design  problem 
are  all  directly  encoded  in  the  plan. 

Design  plans  can  be  indexed  in  a  number  of  ways.  Two  possibilities  are 
to index by design goals (for achieving <goal>, use <plan>), or by
components  (for  designing  <part>,  use  <plan>).  Each  goal  or  component 
may  have  a  small  number  of  alternative  plans  attached  to  it,  with  perhaps 
some  additional  knowledge  that  helps  in  choosing  among  them. 

Control  and  inference  issues  in  the  use  of  plans  are  similar  to  those  in 
the general case of decomposition: alternative plans are possible and, in
routine  design,  design  plan  hierarchies  may  emerge.  The  default  control 
strategy  can  be  characterized  as  instantiate  and  expand.  That  is,  the  plan’s 
steps  specify  some  of  the  design  parameters,  and  also  specify  calls  to 
other  design  plans.  Choosing  an  abstract  plan  and  making  commitments 
that  are  specific  to  the  problem  at  hand  is  the  instantiation  process,  and 
calling  other  plans  for  specifying  details  to  portions  is  the  expansion  part. 
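
A minimal sketch of instantiate and expand follows. It assumes a plan is a record with parameter-setting steps and calls to other plans; this representation, the plan names and the formulas are all hypothetical and are not taken from DSPL or any other system cited here.

```python
def expand(plan_name, plans, spec):
    """Instantiate a plan against `spec`, then expand the plans it calls.

    plans maps a plan name to a dict with two keys:
      "set":  {parameter: function of spec}   (instantiation)
      "call": [names of other plans]          (expansion)
    """
    plan = plans[plan_name]
    # Instantiation: commit to parameter values specific to this problem.
    design = {name: choose(spec) for name, choose in plan["set"].items()}
    # Expansion: call other plans to fill in the details of portions.
    for sub_plan in plan["call"]:
        design.update(expand(sub_plan, plans, spec))
    return design


# Toy usage with made-up plans and formulas.
plans = {
    "cylinder": {
        "set": {"bore_area": lambda s: s["force"] / s["pressure"]},
        "call": ["piston", "rod"],
    },
    "piston": {"set": {"piston_seal": lambda s: "O-ring"}, "call": []},
    "rod": {"set": {"rod_length": lambda s: s["stroke"] + 20}, "call": []},
}
spec = {"force": 500.0, "pressure": 5.0, "stroke": 100}
cylinder = expand("cylinder", plans, spec)
```

The fixed call order in each plan is a simplification; runtime discovery of dependencies between parts of the plan is not modelled.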

A  number  of  additional  pieces  of  information  may  be  needed  or 
generated  as  this  expansion  process  is  undertaken.  Information  about 
dependencies  between  parts  of  the  plan  may  need  to  be  generated  at 
runtime  (e.g.  discovering  that  certain  parameters  of  a  piston  would  need 
to  be  chosen  before  that  of  the  rod),  and  some  optimizations  may  be 
discovered  at  run  time  (e.g.  the  same  base  that  was  used  to  attach 
component  A  can  also  be  used  to  attach  component  B).  NOAH 
(Sacerdoti, 1975) is an early example of runtime generation of
dependencies  and  optimization. 


Design  proposal  by  case  retrieval 

A  major  source  of  design  proposal  knowledge  is  design  cases — instances 
of  successful  past  design  problem  solving.  Cases  can  arise  from  an 
individual's  problem-solving  experience  or  that  of  an  organization  such  as 
a  design  firm  or  a  design  community.  Cases  can  be  episodic  (i.e.  represent 
one  problem-solving  episode)  or  can  represent  the  result  of  abstraction 
and  generalization  over  several  episodes.  Design  plans  can  be  considered 
to  be  fairly  abstracted  versions  of  numerous  cases. 

Sussman  (1973)  proposed  that  a  design  strategy  is  to  choose  an  already 
completed  design  that  satisfies  constraints  closest  to  the  ones  that  apply  to 
the  current  problem,  and  modify  this  design  for  the  current  constraints. 
Schank  (19^)  has  emphasized  the  importance  of  case-based  problem 
solving  in  general.  Recent  work  on  case-based  reasoning  in  planning  and 
design  (Goel  and  Chandrasekaran,  1989a;  Hammond,  1989)  explores  this 
family  of  methods.  In  case-based  reasoning,  “almost  correct  designs”  are 
obtained  by  searching  a  memory  bank  of  previous  cases  for  a  design  that 
is  similar  to  the  one  that  is  currently  being  sought. 

The  heart  of  case-based  design  proposal  is  Matching:  how  to  choose  the 
design  that  is  “closest”  to  the  current  problem?  Clearly  some  features  of 
the  cases  are  more  important  in  matching  than  others.  Some  notion  of 
prioritizing  over  goals  or  differences,  in  the  sense  of  means-ends  analysis, 
may  be  needed. 
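
One simple way to let some features count more than others is a weighted feature match, sketched below. The scoring rule, the default weight of 1.0 and the example cases are assumptions made for illustration; the systems cited in this section use richer indices.

```python
def closest_case(problem, cases, weights):
    """Return the stored case whose features best match `problem`.

    Each feature that agrees with the problem adds its weight to the
    score; features missing from `weights` count 1.0.
    """
    def score(case):
        return sum(
            weights.get(feature, 1.0)
            for feature, value in problem.items()
            if case["features"].get(feature) == value
        )
    return max(cases, key=score)


# Toy usage: agreement on function outweighs agreement on two
# less important features.
cases = [
    {"name": "pump-1",
     "features": {"function": "pump", "fluid": "water", "size": "large"}},
    {"name": "valve-3",
     "features": {"function": "valve", "fluid": "acid", "size": "small"}},
]
problem = {"function": "pump", "fluid": "acid", "size": "small"}
best = closest_case(problem, cases, weights={"function": 5.0})
unweighted = closest_case(problem, cases, weights={})
```

With the weight on function the pump case is retrieved; with all features weighted equally the valve case wins, which shows how strongly the choice of priorities shapes retrieval.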

Indexing  of  cases  with  a  rich  vocabulary  of  features  of  the  case  and  the 
goals  it  satisfies  is  a  key  idea  in  case-based  reasoning.  Matching  and 
retrieval  can  be  driven  by  associative  processes  on  these  indices.  Much  of 
the  work  in  case-based  planning  has  used  domain-specific  goals  to  index 
cases.  For  the  problem  of  designing  engineering  artefacts,  the  design 
cases  need  to  be  indexed  in  terms  of  the  output  behaviours  of  interest. 
For example, Goel and Chandrasekaran (1989b) propose that design
cases  be  indexed  using  their  functions.  More  generally,  they  show  how  cases 
can  be  indexed  by  a  causal  representation  that  relates  the  structure  of  the 
device  to  its  function,  and  how  this  method  of  indexing  can  help  in  retrieval. 
Goel  (1989)  has  a  proposal  for  how  matching  and  retrieval  can  benefit 
from a principled representation for design goals, states for the device,
and  the  substances  the  device  operates  with. 

Case-based  design  proposal  has  a  lot  in  common  with  the  use  of 
analogical  reasoning  in  design.  Maher  et  al.  (1988)  propose  that 
analogical reasoning in design is at the heart of design creativity.


Design proposal by constraint satisfaction

Under  fairly  strong  assumptions,  particular  classes  of  design  problems  can 
be  formulated  as  optimization,  constraint  satisfaction  or  algebraic 
equation-solving  problems.  What  is  common  to  all  these  formulations  is 
that  the  solution  lies  in  a  space  determined  by  simultaneous  constraints, 
and  specific  classes  of  computational  algorithms  may  be  available  to 
locate  that  space  directly.  In  particular,  when  the  structure  of  the  design 
is  already  specified,  but  parameters  are  determined  by  the  specifics  of  a 
design  problem,  numerical  or  symbolic  optimization  techniques  may  be 
useful  for  design  proposal.  Linear,  integer  and  dynamic  programming 
techniques  have  been  used  to  solve  design  problems  formulated  in  this 
manner. 

Some  versions  of  the  constraint  satisfaction  problem  can  be  solved  by 
constraint propagation. Constraints can be propagated in such a way that
the  component  parameters  are  chosen  to  converge  incrementally  on  a  set 
that  satisfies  all  the  constraints  (Stefik,  1981). 
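
As an illustration of incremental convergence, the sketch below prunes finite parameter domains with binary constraints until nothing more can be removed. This is generic arc-consistency style filtering, swapped in as a familiar example; it is not a reconstruction of the method in Stefik (1981), and the component names and constraints are invented.

```python
def propagate(domains, constraints):
    """Prune parameter domains until no binary constraint removes a value.

    domains: {parameter: set of candidate values}
    constraints: list of (x, y, ok) where ok(vx, vy) is True when allowed.
    Returns pruned copies of the domains.
    """
    domains = {name: set(values) for name, values in domains.items()}
    changed = True
    while changed:
        changed = False
        for x, y, ok in constraints:
            # Keep only values that still have a compatible partner.
            keep_x = {vx for vx in domains[x]
                      if any(ok(vx, vy) for vy in domains[y])}
            keep_y = {vy for vy in domains[y]
                      if any(ok(vx, vy) for vx in keep_x)}
            if keep_x != domains[x] or keep_y != domains[y]:
                domains[x], domains[y] = keep_x, keep_y
                changed = True
    return domains


# Toy usage: the piston must equal the bore, and the rod must be
# at most a third of the piston diameter (made-up rules).
pruned = propagate(
    {"bore": {40, 50, 63}, "piston": {50, 63, 80}, "rod": {16, 20, 25}},
    [
        ("bore", "piston", lambda b, p: b == p),
        ("piston", "rod", lambda p, r: r * 3 <= p),
    ],
)
```

Filtering of this kind narrows the choices but does not by itself pick a single consistent assignment; some search over the remaining values may still be needed.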

Formally,  all  design  can  be  thought  of  as  constraint  satisfaction,  and 
one  might  be  tempted  to  propose  global  constraint  satisfaction  as  a 
universal  solution  for  design.  But  unless  knowledge  is  employed  to  reduce 
the  size  of  the  space  (such  as  by  decomposing  problems  into  smaller 
problems),  design  by  constraint  propagation  can  be  computationally 
intractable. Problem decomposition can create subproblems with sufficiently
small problem-spaces in which constraint satisfaction methods can work
without  excessive  search. 

Verification 

This  subtask  involves  checking  that  the  design  proposal  satisfies  the 
functional and other specifications. There are two families of methods for
this: 

•  Attributes  of  interest  can  be  directly  calculated  or  estimated  by 
means  of  domain-specific  algorithms  or  formulae  (e.g.  use  of 
algebraic  formula  to  calculate  total  weight  or  cost,  or  use  of 
finite-element  methods  to  calculate  stress  distribution).  Direct 
calculation  methods  are  not  of  much  interest  from  an  AI  point  of 
view. 

•  Behaviours  of  interest  can  be  derived  by  simulation.  These 
behaviours  can  be  checked  against  requirements. 

Simulation  takes  as  input  a  description  of  the  structure  of  the  system 
and generates as output the behaviours of interest. The methods used in
simulation should mirror the rules by which the behaviour of assemblages
of components is composed from the properties of the components.

There are quantitative simulation methods which use equations that
directly  describe  the  results  of  this  composition.  These  equations  again 
are  domain-specific.  For  example,  differential  equations  may  be  used  to 
describe the behaviour of a reaction in a reaction vessel. The structural
description  in  a  proposed  design  of  a  reaction  vessel  can  be  translated 
into  parameters  of  the  differential  equation  and  the  equation  simulated  to 
derive  behaviours  of  interest. 
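
A minimal sketch of this kind of verification is given below, assuming a first-order reaction dC/dt = -kC integrated by the explicit Euler method. The reaction model, the rate constant (taken here as already derived from the proposed vessel design) and the requirement on final concentration are all hypothetical.

```python
def simulate_concentration(c0, rate_constant, duration, dt=0.01):
    """Euler integration of dC/dt = -k * C; returns the final concentration.

    c0: initial concentration; rate_constant: k (assumed to have been
    derived from the structural description of the proposed design).
    """
    steps = int(round(duration / dt))
    c = c0
    for _ in range(steps):
        c += dt * (-rate_constant * c)
    return c


# Toy verification: the design must bring the concentration below
# 1% of its initial value within 10 time units (made-up requirement).
final = simulate_concentration(c0=1.0, rate_constant=0.5, duration=10.0)
meets_spec = final <= 0.01
```

A production simulator would use a better integrator; the point is only that the derived behaviour is checked against the requirement.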

There are generic AI techniques for generating behaviour from
structure that could be useful for simulation. Qualitative simulation (see
Forbus, 1988, for a survey of the current state of the art), consolidation
(Bylander,  1988)  and  functional  simulation  (Sticklen,  1987)  are  examples 
of  AI  techniques  that  are  available  for  deriving  behaviours  given 
structure.  A  proposed  design  can  be  simulated  under  various  input 
conditions  and  the  behaviour  evaluated.  All  these  techniques  take  as 
input  a  structural  description  and,  using  qualitative  descriptions  of 
component  behaviours  and  rules  of  composition,  mimic  the  operation  of 
the  device  to  produce  qualitative  descriptions  of  behaviour.  Qualitative 
and  quantitative  simulation  may  alternate:  a  qualitative  simulation  may 
identify behaviours likely to be in unacceptable ranges and a more
focused  quantitative  procedure  may  be  used  to  get  more  precise  values. 

Visual  simulations.  Visual  simulation  of  artefacts  is  widely  used  by 
human  designers  in  verification.  Designs  are  imagined,  represented,  and 
communicated  pictorially  in  domains  such  as  architecture  and  mechanical 
engineering. (See Goel and Pirolli (1989) for design protocol studies
which  show  the  prevalence  of  images  during  design.)  It  is  clear  that  there 
is  a  need  for  pictorial  representations  and  symbolic  representations  to 
coexist  in  design  systems.  A  major  use  of  imaginal  representations  is  in 
simulation  of  design  proposals,  but  they  play  a  role  as  well  in  making 
design  proposals  by  analogy  with  other  domains.  Little  AI  research  has 
been  done  so  far  on  visual  representations  that  have  the  qualities  needed 
for  pictorial  reasoning  and  imagination  and  that  also  have  the  symbolic 
properties  needed  for  arbitrary  referencing  and  composition  by  parts.  A 
beginning  in  this  direction  is  proposed  in  Chandrasekaran  and  Narayanan 
(1990)  and  use  of  such  representations  for  simulation  is  discussed  in 
Narayanan  and  Chandrasekaran  (1991). 

Critiquing 

Critiquing  is  the  subtask  in  which  causes  of  failure  of  a  design  are 
analysed:  parts  of  the  structure  are  identified  as  potentially  responsible 
for  the  unacceptable  behaviour  or  constraint  violation.  Critiquing  is  really 
a  generalized  version  of  the  diagnostic  problem,  i.e.  a  problem  of 
mapping  from  undesirable  behaviour  to  parts  of  the  structure  responsible 
for the behaviour. Modification of design can be directed to these
candidates.  Of  course,  localization  of  responsibility  for  failure  will  not 
always  work;  the  entire  approach  to  the  design  may  need  to  be  changed. 

What  is  needed  for  criticism  is  information  about  how  the  structure  of 
the  device  contributes  to  (or  is  intended  to  contribute  to)  the  desired 
overall  behaviour.  An  AI  method  that  is  commonly  used  for  this  subtask 
is  dependency  analysis  (Stallman  and  Sussman,  1977).  This  method  is 
applicable if explicit information is available in the form of dependencies,
i.e.  knowledge  that  explicitly  relates  types  of  constraint  or  specification 
violations  to  prior  design  commitments.  For  example,  if  total  weight  of  a 
proposed design is higher than the weight limit, domain-specific
knowledge  is  usually  available  which  identifies  parts  whose  weights  are 
both  sufficiently  large  and  can  be  adjusted.  Dependencies  may  be 
discovered by analysing pre- and post-conditions of design operators. For
example, if a certain output behaviour (say, voltage in an electronic
device) of a proposed design is excessive, the inputs to the output stage can
be  traced  back  to  identify  which  of  the  components  upstream  may  have 
contributed  to  the  specific  output.  This  analysis  may  use  simulation  as  a 
subtask. 
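
The weight example can be sketched as a small critic. The data layout, the `min_share` threshold for "sufficiently large" and the part names are assumptions made for illustration; real dependency knowledge would be domain-specific.

```python
def weight_critique(parts, weight_limit, min_share=0.2):
    """Name parts to blame when total weight exceeds the limit.

    parts: {name: {"weight": float, "adjustable": bool}}
    Returns adjustable parts carrying at least `min_share` of the total,
    heaviest first; an empty list means no violation or no candidate.
    """
    total = sum(p["weight"] for p in parts.values())
    if total <= weight_limit:
        return []
    candidates = [
        name for name, p in parts.items()
        if p["adjustable"] and p["weight"] >= min_share * total
    ]
    return sorted(candidates, key=lambda name: parts[name]["weight"],
                  reverse=True)


# Toy usage: total weight is 100 against a limit of 90.
parts = {
    "frame": {"weight": 40.0, "adjustable": True},
    "motor": {"weight": 35.0, "adjustable": False},
    "housing": {"weight": 20.0, "adjustable": True},
    "fasteners": {"weight": 5.0, "adjustable": True},
}
blamed = weight_critique(parts, weight_limit=90.0)
```

The motor is heavy but cannot be adjusted, and the fasteners are adjustable but too light to matter, so only the frame and housing are proposed as candidates for modification.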

Most  of  the  proposals  for  critiquing  that  have  been  in  the  case-based 
reasoning literature use domain-specific critics and are variations on
precompiled patterns of relating output behaviour to possible changes. The
approach of Goel (1989) for critiquing a design proposal is based on a
functional  analysis  of  the  proposed  design.  If  a  design  proposal  is 
endowed  with  causal  indices  that  explicitly  indicate  the  relation  between 
structure  and  intended  functions,  then  it  is  relatively  easy  to  identify 
substructures for modification (Goel and Chandrasekaran, 1989a).

Modification 

Modification  as  a  subtask  takes  information  about  failure  of  a  candidate 
design  as  its  input  and  then  changes  the  design  so  as  to  get  closer  to  the 
specifications. Basically, what is required is change of a functional subpart
of  the  proposed  design,  or  addition  of  components  to  the  proposed 
design,  so  as  to  satisfy  the  design  specifications.  Depending  on  how 
informative  failure  analysis  is  and  what  types  of  knowledge  are  available, 
a  number  of  problem-solving  processes  are  applicable.  Some  of  them  are 
briefly  outlined  in  the  following  paragraphs. 

Modification  may  be  driven  by  a  form  of  means-ends  reasoning,  where 
the differences are “reduced” in order of most to least significant.
Especially useful here is knowledge that relates the desired changes in
behaviour to possible structural changes (Goel, 1989).

A related search approach is one where modification is done by some
form of hill-climbing. In this method, parameters are changed, direction of
improvement is noted, and additional changes are made in the direction
of  maximal  increment  in  some  measure  of  overall  performance.  This  is 
especially  applicable  where  the  design  problem  is  viewed  as  a  parameter- 
choice  problem  for  a  predetermined  structure  (e.g.  the  Dominic 
system; Dixon et al., 1984).
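
A minimal sketch of hill-climbing over parameters follows. It assumes a fixed step size, a caller-supplied performance measure and a bound on the number of steps; it is a generic illustration and makes no claim about how the Dominic system works.

```python
def hill_climb(params, performance, step=1.0, max_steps=1000):
    """Greedy parameter modification.

    At each step, try changing each parameter up and down by `step`,
    take the single change that improves `performance` most, and stop
    when no change improves it.
    """
    params = dict(params)
    for _ in range(max_steps):
        best, best_score = None, performance(params)
        for name in params:
            for delta in (step, -step):
                trial = dict(params, **{name: params[name] + delta})
                score = performance(trial)
                if score > best_score:
                    best, best_score = trial, score
        if best is None:
            return params        # local optimum reached
        params = best
    return params


# Toy usage: performance peaks at width 4 and height 7 (made-up measure).
def performance(p):
    return -((p["width"] - 4) ** 2 + (p["height"] - 7) ** 2)

tuned = hill_climb({"width": 0.0, "height": 0.0}, performance)
```

As with any hill-climbing scheme, the result is only a local optimum of the chosen measure; the toy measure has a single peak, so the climb finds it.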

Modification  is  straightforward  in  dependency-directed  methods.  Once 
the dependency point is reached by backtracking, an alternative
choice is simply made from the finite list of choices available.

Some  systems  that  perform  routine  design  problems  have  explicit 
knowledge  about  what  to  do  under  different  kinds  of  failures.  This 
information  can  be  attached  to  the  design  plans  (DSPL;  Brown  and 
Chandrasekaran,  1989). 

Criticism  may  reveal  the  need  to  add  new  functions.  If  these  functions 
can  be  added  modularly,  i.e.  by  the  creation  and  integration  of  separate 
substructures  that  deliver  the  functions,  the  design  of  the  additional 
structures  can  be  viewed  simply  as  new  design  problems  to  be  solved  by 
all  the  methods  available  for  design.  The  subtasks  of  generation  of 
specifications  for  these  additional  design  problems  and  integration  of  their 
solutions  were  discussed  in  the  section  on  problem  decomposition  and 
solution  recomposition. 


DISCUSSION  OF  THE  TASK-STRUCTURE 

The task-structure for design described in the preceding section* is
summarized in Table 1. A task-structure is a description of the task,
proposed methods for it, their internal and external subtasks, knowledge
required for the methods, and any control strategies for the method. Thus
the task analysis provides a clear road map for knowledge acquisition.
How the analysis can be used to integrate the methods and goals is
discussed in the following section.

* The task-structure described here is inherently incomplete: additional
methods may be identified for any subtask as a result of further research.


Table 1

Task                 Methods                         Subtasks
-------------------  ------------------------------  ------------------------------
Design               Propose, Critique, Modify       Propose, Verify, Critique,
                     family (PCM)                    Modify

Propose              Decomposition methods (incl.    Specification generation for
                     Design Plans) and               subproblems; Solution of
                     Transformation methods          subproblems generated by
                                                     decomposition (another set of
                                                     Design-tasks); Composition of
                                                     subproblem solutions

                     Case-based methods              Match and retrieve similar
                                                     case

                     Global constraint-satisfaction
                     methods

                     Numerical optimization methods

                     Numerical or Symbolic
                     constraint propagation methods

Specification        Constraint propagation,         Simulation to decide how
generation for       including constraint posting    constraints propagate
subproblems

Composition of       Configuration methods           Simulation for prediction of
subproblem                                           behaviour of candidate
solutions                                            configurations

Verify               Domain-specific calculations
                     or simulation

                     Qualitative simulation,
                     Consolidation

                     Visual simulation

Critique             Causal behavioural analysis
                     techniques to assign
                     responsibility

                     Dependency-analysis techniques

Modify               Hill-climbing-like methods
                     which incrementally improve
                     parameters

                     Dependency-based changes

                     Function-to-structure mapping
                     knowledge

                     Add new functions               Design new function.
                                                     Recompose with candidate
                                                     design

For each task, there is a default 'compiled knowledge' method which has
domain-specific knowledge to achieve it directly and which is not included
above. For subtasks such as critiquing, only families of generic AI methods
are indicated, without explicit indication of their subtasks.


Choice of methods. How are methods to be chosen for the various
tasks? The following is a set of criteria.

•  Properties of the solution. Some methods may produce answers
which are precise, while answers of the others may only be
qualitative. Some of them may produce optimal solutions, while
others may produce satisficing ones.

•  Properties of the solution process. Is the computation pragmatically
feasible? How much time does it take? Memory?

•  Availability of knowledge required for applying the method.
For example, a method for design verification might require that
we have available a description of the behaviour of the device as
a system of differential equations; if this information is not
available directly and if it cannot be generated by additional
problem solving, the method cannot be used.

A delineation of the methods and their properties helps us to move
away from abstract arguments about ideal methods for design. Each
method in a task-structure can be evaluated for appropriateness in a given
situation by asking questions reflecting the above criteria. While some of
this evaluation can take place at problem-solving time, much of it can be
done at the time of design of the knowledge system; this evaluation can
be used to guide a knowledge-system designer in the choice of methods to
implement.
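
One way to picture such an evaluation is as a filter over method descriptions, one test per criterion. The sketch below is only illustrative: the class `MethodProfile`, its fields, the cost units, and the two example entries are assumptions introduced here, not part of the task-structure itself.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class MethodProfile:
    name: str
    precise: bool            # property of the solution (precise vs. qualitative)
    cost: float              # property of the solution process (arbitrary units)
    needs: FrozenSet[str]    # knowledge required to apply the method


def candidate_methods(methods: List[MethodProfile],
                      available: Set[str],
                      need_precise: bool,
                      budget: float) -> List[MethodProfile]:
    """Keep the methods that pass all three criteria, cheapest first."""
    usable = [m for m in methods
              if m.needs <= available                  # knowledge is available
              and (m.precise or not need_precise)      # solution is good enough
              and m.cost <= budget]                    # computation is feasible
    return sorted(usable, key=lambda m: m.cost)


# Hypothetical profiles for two verification methods.
verification_methods = [
    MethodProfile("differential-equation simulation", True, 10.0,
                  frozenset({"differential equations"})),
    MethodProfile("qualitative simulation", False, 2.0,
                  frozenset({"qualitative model"})),
]

chosen = candidate_methods(verification_methods, {"qualitative model"},
                           need_precise=False, budget=5.0)
```

The same filter could be run once by the knowledge-system designer or repeatedly at problem-solving time; only the source of the `available` set changes.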

Different types of methods may be used for different subtasks. For
example, a design system may use a knowledge-based problem-solving
method for the subtask of creating a design, but use a quantitative
method such as a finite-element method for the subtask of evaluating the
design.


IMPLICATIONS FOR AN ARCHITECTURE FOR DESIGN PROBLEM
SOLVING

Because of the multiplicity of possible methods and subtasks for a task, a
task-specific architecture that is exclusively for design is not likely to be
complete: even though design is a generic activity, there is no one generic
method for it. Further, note that subtasks such as simulation are not
particularly specific to design as a task. Thus if the knowledge for these
modules is embedded within a design architecture, either they will be
unavailable for other tasks which require simulation as a subtask, or the
knowledge for these tasks will need to be replicated. Thus, instead of
building monolithic task-specific architectures for such complex tasks, a
more useful architectural approach is one that can invoke different
methods for different subtasks in a flexible way.

Following the ideas in the work on task-specific architectures, we can
support methods by means of special-purpose shells that can help encode
knowledge and control problem solving. This is an immediate extension
of the generic task methodology (Chandrasekaran, 1986). These methods
can then be combined in a domain-specific manner, i.e. methods for
subtasks can be selected in advance and included as part of the
application system. Alternatively, methods can be chosen at runtime for
the tasks recursively, based on the criteria listed above in the paragraph
on choice of methods. For the latter, what is needed is a
task-independent architecture with the capability of evaluating different
methods, choosing one, executing it, setting up subgoals as they arise
from the chosen method and repeating the process. Soar (Rosenbloom et
al., 1987), BB1 (Hayes-Roth, 1985) or TIPS (Punch, 1989) are good
candidates for such an architecture. This approach combines the
advantages of task-specific architectures and the flexibility of runtime
choice of methods. The DSPL++ work of Herman (1992) is an attempt
to do precisely this.
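
The evaluate, choose, execute, set-up-subgoals cycle just described can be sketched as a small recursive procedure. This is a toy illustration only and does not model Soar, BB1, TIPS or DSPL++; the names `Method` and `solve`, the scoring functions, and the example state are assumptions.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Method:
    name: str
    subtasks: List[str]                 # subgoals this method sets up
    score: Callable[[dict], float]      # suitability in the current state


def solve(task: str, methods_for: Dict[str, List[Method]],
          state: dict, depth: int = 0) -> List[str]:
    """Return an indented trace of the method chosen for `task` and its subtasks."""
    candidates = methods_for.get(task, [])
    if not candidates:
        # No listed method: assume the task is achieved directly.
        return ["  " * depth + f"{task}: direct"]
    chosen = max(candidates, key=lambda m: m.score(state))   # evaluate and choose
    trace = ["  " * depth + f"{task}: {chosen.name}"]
    for sub in chosen.subtasks:                              # set up subgoals
        trace.extend(solve(sub, methods_for, state, depth + 1))
    return trace


# A fragment of the design task-structure with hypothetical scoring rules.
methods_for = {
    "Design": [Method("PCM", ["Propose", "Verify", "Critique", "Modify"],
                      lambda s: 1.0)],
    "Propose": [
        Method("Decomposition", ["Specify subproblems", "Compose solutions"],
               lambda s: 0.5),
        Method("Case-based", ["Match and retrieve similar case"],
               lambda s: 1.0 if s.get("case_library") else 0.0),
    ],
}

trace = solve("Design", methods_for, {"case_library": True})
```

With a case library available the sketch selects the case-based method for Propose; without one it falls back to decomposition, which is the kind of runtime choice the paragraph above has in mind.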

Using method-specific knowledge and strategy representations within a
general architecture that helps select methods and set up subgoals is a
good first step in adding flexibility to the advantages of the task-specific
architecture view. However, it can have limitations as well. For many
real-world problems, switching between methods may result in control
that is too large-grained. In order to see this, consider my earlier
description of a PCM method. The method description calls for a specific
sequence of how the operators of Propose, etc., are to be applied. As
pointed out in my discussion on the PCM method, numerous variants of
the method, with complex sequencing of the various operators, may be
appropriate in different domains. It would be a hopeless task to try to
support all these variants of the methods by method-specific architectures
or shells. It is much better in the long run to let the task-method-subtask
analysis guide us in the identification of the needed task-specific
knowledge and let a flexible general architecture determine the actual
sequence of operator application by using additional domain-specific
knowledge. The subtasks can then be combined flexibly in response to
problem-solving needs, achieving a much finer-grained control behaviour.
(See Johnson et al., 1989 for realization of generic task ideas in Soar.)

The  task  structure  also  makes  clear  how  “AI-like”  methods  and  other
algorithmic  or  numerical  methods  can  be  flexibly  combined,  much  as 
human  designers  alternate  between  problem  solving  in  their  heads  and 
formal  calculations.  For  example,  a  designer  may  need  to  make  sure  that 
the  maximum  current  in  a  proposed  circuit  is  less  than  the  limits  for  its 
components,  and  at  that  point  he  may  set  up  current  and  voltage 
equations  and  solve  them.  If  he  finds  that  the  current  in  one  branch  of  the 
circuit  is  more  than  the  permitted  limit,  he  may  go  back  to  critiquing  the 
design  to  look  for  possible  places  to  change  the  design.  The  task-structure 
view  that  I  have  outlined  shows  how  computer-based  design  systems  can 
also  similarly  engage  in  a  flexible  integration  of  problem-solving  and 
other  forms  of  algorithmic  activity.  The  key  is  that  the  top-level  control  is 
goal-oriented,  and  it  can  set  up  subgoals  and  choose  methods  that  are 
appropriate  to  the  subgoal.  If  the  appropriate  method  for  a  subtask  is  a 
numerical  algorithm,  that  method  can  be  invoked  and  executed,  at  the 
end  of  which  control  reverts  to  the  top  level  for  pursuing  other  goals. 
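
The circuit example can be made concrete with a small sketch in which a numerical check is invoked as the method for a verification subgoal and its result decides the next goal. The circuit (branches in parallel across one voltage source), the component values, and the function names are all hypothetical.

```python
from typing import Dict, List


def branch_currents(voltage: float, resistances: Dict[str, float]) -> Dict[str, float]:
    """Ohm's law for branches connected in parallel across one source."""
    return {name: voltage / r for name, r in resistances.items()}


def over_limit(voltage: float, resistances: Dict[str, float],
               limits: Dict[str, float]) -> List[str]:
    """Numerical verification: list branches whose current exceeds its limit."""
    currents = branch_currents(voltage, resistances)
    return [name for name, amps in currents.items() if amps > limits[name]]


# Hypothetical candidate design: a 12 V source feeding two branches.
violations = over_limit(12.0,
                        {"R1": 100.0, "R2": 10.0},   # ohms
                        {"R1": 0.5, "R2": 1.0})      # amperes

# Control returns to the goal-oriented top level, which picks the next goal.
next_goal = "Critique" if violations else "Done"
```

Here the branch through `R2` carries 1.2 A against a 1.0 A limit, so the top level would set up a Critique subgoal for that part of the design.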


CONCLUDING  REMARKS 

Over  the  last  several  years,  there  have  been  a  number  of  working  systems 
which  perform  some  version  of  the  design  task  in  some  domain.  These 
design  proposals  do  not  always  bring  out  what  is  common  among  the 
different  tasks  of  design.  There  have  also  been  attempts  to  develop 
formal  “first  principles”  algorithms  for  design  that  are  meant  to  cover  all
types  of  design.  Such  general  algorithms  are,  however,  computationally 
intractable,  and  are  not  particularly  helpful  in  identifying  the  sources  of 
power  and  tractability  in  human  design  problem  solving  in  most  domains. 

The  view  elaborated  here  is  that  there  is  a  generic  vocabulary  of  tasks 
and  methods  that  are  part  of  design,  and  that  design  problems  in  different 
domains  simply  differ  in  the  mixture  of  subtasks  and  methods.  Expertise, 
i.e.  methods,  and  knowledge  and  control  strategies  for  them,  emerge 
over  a  period  in  different  domains  so  as  to  help  solve  the  task  in  a  given 
domain  tractably.  The  key  to  understanding  all  this  is  thus  not  in  a 
uniform  algorithm  for  design,  but  in  the  structure  of  the  task,  showing 
how  the  tasks,  methods,  subtasks  and  domain  knowledge  are  related.  The 
analysis  also  clarifies  the  relationship  between  task-specific  architectures 
and  more  general-purpose  architectures  for  knowledge  systems.


Task-Structure Analysis for
Knowledge Modeling

In recent years there has been increasing interest in describing
complicated information processing systems in terms of the knowledge they
have, rather than by the details of their implementation. This requires a
means of modeling the knowledge in a system. Several different approaches
to knowledge modeling have been developed by researchers working in
Artificial Intelligence (AI). Most of these approaches share the view that
knowledge must be modeled with respect to a goal or task. In this article,
we outline our modeling approach in terms of the notion of a
task-structure, which recursively links a task to alternative methods and
to their subtasks. Our emphasis is on the notion of modeling domain
knowledge using tasks and methods as mediating concepts. We begin by
tracing the development of a number of different knowledge-modeling
approaches. These approaches share many features, but their differences
make it difficult to compare systems that have been modeled using
different approaches. We present these approaches and describe their
similarities and differences. We then give a detailed description, based
on the task structure, of our knowledge-modeling approach and illustrate
it with task structures for diagnosis and design. Finally, we show how the
task structure can be used to compare and unify the other approaches.


Knowledge-Based  Systems: 
What  are  they? 

A knowledge-based system (KBS) has explicit representations of
knowledge as well as inference processes that operate on these
representations to achieve a goal. An inference process consists of a
number of inference steps, each step creating additional knowledge.
The process of applying inference steps is repeated until the
information needed to fulfill the requirements of the problem-solving
goal or task is generated. Typically, both domain knowledge and possible
inference steps have to be modeled and represented in some form.

In one sense, knowledge is of general utility: the same piece can
be utilized in different contexts and problems; so, unlike traditional
procedural approaches, knowledge should not be tied to one task or
goal. On the other hand, it is difficult to know what knowledge to put
in a system without having an idea of the tasks the KBS will confront.
In spite of claims of generality, all KBSs are designed with some task
or class of tasks in mind. Similarly, they are designed to be operational
across some range of domains. Thus, a clear understanding of the
relationship between tasks, knowledge and inferences required to
perform the task is needed before knowledge in any domain can be
modeled.

Background 

Tasks 

The word “task” has been used in somewhat different senses in the
field, contributing to much confusion. For example, Wielinga et al.
in [33] describe a task as a “fixed strategy” for achieving a goal,
implying that it is a term synonymous with a method or a procedure
specification. In our original work on generic tasks (GT) [6], there was
a conflation of the goal with the method: the GTs could be thought
of as components of a composite method (e.g., the goal of diagnosis
is achieved by a method composed of “data abstraction” and
“classification”), or they could be thought of as goals as well (the GT
called “hypothesis assessment” had the goal to assess hypotheses). In this
article, we use the word “task” as synonymous with types of
problem-solving goals: for example, we call diagnosis a task, since we
want to talk abstractly about the family of problems, all of which are
characterized by achieving the goal of generating a causal explanation of
observed abnormal behavior. Specifically, we want to separate the
task from the method used to achieve goals of this type.

The  Knowledge  Level 
Newell’s Knowledge Level (KL) framework [27] is very useful for
describing intelligent systems without becoming bogged down in
accidental features of implementation languages (see [20] in this issue
for additional discussion and examples of the knowledge level). Much of
the discussion in the field has been vitiated by too premature a
commitment to a symbol-level representation (e.g., whether the
representation will be rules or frames, and whether backward- or
forward-chaining will be employed). Newell proposed that problem-solving
agents can be characterized in terms of the knowledge and goals that can
be attributed to them, and the Principle of Rationality by means of which
intelligent agents can be assumed to use knowledge relevant to their
goals. Thus, in discussing a diagnostic system, whether it is implemented
as a rule-based system or a connectionist network, we can talk about the
task (or goal) as diagnosis of a certain type, and can identify the
knowledge content of the system in two ways: as knowledge about the set
of malfunctions, and knowledge that aids in mapping from observations to
the malfunctions. Hence, at the knowledge-modeling level, we relate the
task to the types of knowledge needed to accomplish it. We can then make
additional implementation commitments which will, in turn, give us
additional constraints on the forms of knowledge.

The knowledge-level view does not include a specific account of
how the problem will be solved (i.e., it does not indicate the
representations and inference methods utilized to accomplish the task).
However, the knowledge-level view can be applied recursively; that is,
some commitment to an inference method can be made fairly abstractly;
then this too can be represented as knowledge at the knowledge level.
This process can be repeated until the knowledge-level description
includes some description of the strategies as well. Each commitment to
a method requires some commitment to how the problem will be solved, but
not as detailed a symbol-level commitment as is normally done when a
programming language such as rules or frame languages is employed.

To see how the KL is used, consider how it can be applied to
describe MYCIN, a KBS for selecting therapies for bacterial infections of
the blood [33]. At the highest knowledge level the goal of MYCIN
is to select a therapy. The knowledge required to do this maps signs
and symptoms to therapies. At the next level we can reapply the KL
and say that MYCIN’s therapy goal is accomplished by first identifying
the bacterial infection present, then selecting a therapy for that
infection. Hence, we break the top goal, therapy, into two subgoals,
diagnosis and selection. At this level we can be more specific about the
types of knowledge required for the task. Diagnosis requires knowledge
mapping signs and symptoms to an infection. Selection requires
knowledge mapping infections and patient data to a therapy. In such a
way, we can continue to apply the KL until the system is specified in
sufficient detail to allow its implementation. In MYCIN, each of the
subtasks is implemented in a backward-chaining rule-based system.
The point, however, is that an accurate KL description of MYCIN
hides implementation details of the system.

There are many ways to specify an information processing system
without describing implementation details. Examples include the
knowledge level, abstract algorithm specifications and Marr’s
information processing level [22]. The selection of an information
processing description depends largely on the system being described and
the purpose of the description. The KL is primarily designed for use in
describing intelligent agents; hence it describes an information
processing system as having goals, actions and bodies of knowledge.

Background  Work  in  Knowledge 
Modeling 

Some of the earliest work in knowledge modeling was done as part of
the rule-based system approach. In this approach, the agent's knowledge
was viewed mainly as directly available recognition knowledge
(i.e., knowledge that indicates exactly what to do in a situation). The
knowledge modeling scheme was simply to list, as knowledge, the
condition-action rules by which a system behaves. Simple
condition-action statements, in which the conditions match the current
situation and the actions add to or modify that situation, are termed
production rules. However, this level of description does not indicate
the real control structure of the system at the task level. For example,
the fact that R1 [23] performs a linear sequence of subtasks is not
explicitly encoded; the system designer “encrypted,” so to speak, this
control in the pattern-matching of OPS5, the production-rule system
in which R1 is implemented.
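
The point about control being hidden in pattern-matching can be seen in a toy production system. The sketch below is not OPS5 syntax and the rules are not R1's; the interpreter `run_rules`, the rule names, and the facts are hypothetical, and conflict resolution is reduced to "first matching rule in list order".

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

Rule = Tuple[str, FrozenSet[str], FrozenSet[str]]   # (name, conditions, additions)


def run_rules(rules: List[Rule], facts: Iterable[str]) -> Tuple[Set[str], List[str]]:
    """Fire rules until none can add anything new; return facts and firing order."""
    memory = set(facts)
    fired: List[str] = []
    changed = True
    while changed:
        changed = False
        for name, conditions, additions in rules:
            # A rule matches when all its conditions are in working memory
            # and it would still add something new.
            if conditions <= memory and not additions <= memory:
                memory |= additions
                fired.append(name)
                changed = True
                break
    return memory, fired


# Rules listed in an arbitrary order; nothing states "do these in sequence".
rules: List[Rule] = [
    ("add-cables", frozenset({"boards placed"}), frozenset({"cabled"})),
    ("place-boards", frozenset({"cabinet chosen"}), frozenset({"boards placed"})),
    ("choose-cabinet", frozenset({"order"}), frozenset({"cabinet chosen"})),
]

memory, fired = run_rules(rules, {"order"})
```

The rules fire as choose-cabinet, place-boards, add-cables. That linear sequence of subtasks is nowhere written down; it emerges only from how each rule's conditions match what earlier rules added.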

Another early knowledge-modeling scheme was based on frames.
Frames were often proposed to be at the “knowledge level,” since they
supposedly were used for representing objects in the domain and their
relations, a “deeper” level of representation than production rules.
The knowledge-level idea behind frames is that they capture
stereotypical knowledge; this idea, however, is not sufficient for
modeling control knowledge at the task level. The problem is that frames
and frame languages do not provide a task-level vocabulary for modeling
control knowledge. When frame languages were used, control of a system
was often described at a syntactic level: for example, in terms of which
links to pursue for inheritance.

The problem was that during the first decade of knowledge-based
systems research, the discussion was almost entirely in terms of the
symbol level: in the rule-based paradigm all the problems were posed
as issues at the rule architecture level. Very little discussion took
place at the level of the relation between the task for which the system
was being designed and the kinds of knowledge needed. For example, a
major research issue for rule-based systems was the development of an
appropriate domain- and task-independent conflict resolution strategy
that would let the system choose which production rule to fire when
multiple rules matched. When the knowledge is viewed at the appropriate
level, however, we can often see the existence of organizations of
knowledge that bring up only a small, highly relevant body of knowledge
without any need for conflict resolution.

The first set of insights regarding the analysis of knowledge systems
at one level removed from their implementations came from a number of
sources. Gomez and Chandrasekaran identified classification as a common
element in diagnosis [13], and Mittal and Chandrasekaran added data
abstraction as another common element [23]. This work led directly to
a knowledge-modeling scheme in the form of generic task (GT)
languages [6]. A generic task identifies a task of general utility (such
as classification), a method for doing the task and the kinds of
knowledge needed by the method. The language is made of primitives that
allow the required knowledge to be directly described for any domain
in which the task can be performed. Chandrasekaran and his colleagues
identified a number of such generic tasks [6]. They also showed by
example how complex tasks such as diagnosis could be decomposed
into such generic tasks [9].

Hierarchical classification [3, 9] was the first generic task to be
identified and will serve as a good example of the task-based approach
that we and other researchers have been developing. The task of
hierarchical classification is the identification of an object based on a
set of features. For example, diagnosis can be viewed as classification
in which the input is a set of manifestations and the output is the
disorder associated with these manifestations. The name of the generic
task, hierarchical classification, alludes to the method identified to
solve the task. The method, called Establish-Refine, assumes the
existence of a classification hierarchy of output categories. In the case
of medical diagnosis, the hierarchy contains diseases. More general
classes of diseases are located at the top of the hierarchy; more specific
diseases are located near the bottom. The method operates by first
attempting to establish (i.e., confirm to a certain level of confidence)
the topmost category. If this can be established, it is then refined; its
successors become category hypotheses for the system to consider next.
Categories that cannot be established are not refined; hence the
hierarchy below these categories does not need to be explored. The
generic task description clarifies the control structure and knowledge
of a classification system. Instead of describing the system in terms of
rules or frames, we can describe the system in terms of categories,
category evaluation and refinement. The knowledge required to use
hierarchical classification is made explicit: knowledge must be available
to test and refine categories. The control of the system is explicitly
described at a task level (categories are evaluated and then, if
necessary, refined) rather than at the implementation or symbol level.
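
The Establish-Refine control just described is compact enough to sketch directly. The version below is a simplified illustration: establishment is reduced to a yes/no test rather than a confidence level, and the disease hierarchy and evidence table are toy assumptions, not taken from MDX or any real system.

```python
from typing import Callable, Dict, List, Set


def establish_refine(category: str,
                     hierarchy: Dict[str, List[str]],
                     establish: Callable[[str], bool],
                     tested: List[str]) -> List[str]:
    """Return the most specific established categories at or below `category`."""
    tested.append(category)
    if not establish(category):
        return []                       # rejected: nothing below it is explored
    refined: List[str] = []
    for child in hierarchy.get(category, []):   # refine into successor categories
        refined.extend(establish_refine(child, hierarchy, establish, tested))
    return refined or [category]        # keep this category if no child holds


# Hypothetical classification hierarchy and evidence requirements.
hierarchy = {
    "disease": ["liver disease", "heart disease"],
    "liver disease": ["hepatitis", "cirrhosis"],
    "heart disease": ["cardiomyopathy"],
}
required: Dict[str, Set[str]] = {
    "disease": set(),
    "liver disease": {"jaundice"},
    "heart disease": {"chest pain"},
    "hepatitis": {"viral markers"},
    "cirrhosis": {"fibrosis"},
    "cardiomyopathy": {"enlarged heart"},
}
findings = {"jaundice", "viral markers"}

tested: List[str] = []
result = establish_refine("disease", hierarchy,
                          lambda c: required[c] <= findings, tested)
```

Because "heart disease" is not established, its successor is never tested. This pruning is the reason the method needs no separate conflict-resolution strategy.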

Somewhat near this time, Clancey had identified “heuristic
classification” [11] as a somewhat abstract pattern of inference implicit
in MYCIN (see Figure 1). Heuristic classification itself was presented as
independent of the rule language in which MYCIN was written so
that this higher-level inference pattern could be seen independent of
the rule-level representation. Clancey’s approach is similar to the GT
approach, having identified a task (classification), a method (heuristic
classification) and the kinds of knowledge needed to use the
method. In fact, the three inferences in heuristic classification (see
Figure 1) can be interpreted as three subtasks. MDX [9], the
generic-task diagnosis system, also incorporated the same task
decomposition. Data abstraction was done using an intelligent database;
heuristic match was done by establishing categories in the classification
hierarchy, and refinement was done during classification. The
difference between the two approaches lies in the explicit identification
by Clancey of this combined inference structure as heuristic
classification, whereas Chandrasekaran et al. had broken this structure
down into its components.

McDermott and associates started investigating the roles of
knowledge in various methods for several tasks [24]. Their goal was to
develop programs that could automatically acquire knowledge from a
domain expert. To do this they developed role-limiting methods for
solving general tasks, such as the cover-and-differentiate method for
diagnosis and the propose-and-revise method for design. Role-limiting
methods are “methods that strongly guide knowledge collection and
encoding” [24]. They proceeded to specify the roles various
types of knowledge play in the operation of each method. The major
difference between the role-limiting method approach and most of
the other approaches discussed here is the requirement that a
role-limiting method be completely specified (i.e., that all tasks and
subtasks be prespecified down to the level of primitive operations).

Musen investigated ways to model classes of planning problems
and the required domain knowledge [26]. He advocated the development
of a task model followed by the use of the model to acquire
domain facts. His system, Protégé, provides a language for modeling
classes of planning problems based on skeletal plan refinement. Once
modeled, Protégé created a knowledge editor which domain experts
could interact with to build KBSs that solved problems in the planning
class.

Gruber and Cohen investigated task models as mediating
representations of knowledge acquisition for a diagnostic task [16]. They
constructed MU, a task-specific architecture for building application
programs that do prospective diagnosis. A companion system, called
ASK for “Acquisition of Strategic Knowledge,” interacts with an expert
to acquire knowledge for MU-based systems. Because ASK is written
specifically for MU, it knows about the strategy and types of
knowledge needed for the task; hence ASK can interact with an
expert at the task level rather than at a lower implementational level.

Figure 1. Inference structure for heuristic classification. Adapted
from [11].

In Europe, Wielinga and Breuker proposed a set of primitive
terms in which to model the tasks that expert systems perform and, in
turn, the use of these terms as a modeling language to capture the
knowledge in the domain [1]. Their methodology, called KADS,
advocates a bottom-up approach to expert knowledge modeling where
the knowledge modeler begins with an expert verbal protocol, models
this with primitive terms, then builds higher levels of analysis on
top of the primitive model. This is quite different from the methodology
behind GTs in which the knowledge modeling takes place in
a top-down fashion by matching known generic tasks to the task
being modeled.

More recently, Steels has proposed work along lines that build
on the notion of tasks and task structures [34]. In his formulation,
the task structure is intended to specify the task/subtask decomposition
of a complex task such as diagnosis. There is a clear recognition
that the subtasks of a task depend on the method used for the task.
For example, a task such as diagnosis might be done using a
classification method. Classification specifies additional subtasks, such
as evaluating and refining hypotheses. The task structure notion of
Steels does not explicitly represent alternate methods for each of the
tasks; instead it is a tree of tasks and subtasks, with the method chosen
implicit in the analysis. Thus a given task structure implicitly assumes
the choice of some method for a given task. Therefore, it is a good
tool for the description of how a particular knowledge system solves
the task for which it is intended. The notion of the task structure we
will develop later in this article explicitly represents the methods for
each task, which then provides a framework for the dynamic selection
of methods at run time.

These approaches share two important features. First, they identify
tasks at various levels of abstraction above the implementation
language level. Second, they identify types of knowledge and
strategies closely associated with such tasks. This is the key point
in knowledge modeling; once such terms are identified, we have a
language in which to model the knowledge in the domain and the
strategies to solve the problem. The terms in the vocabulary can be
used to encode knowledge, mediate knowledge acquisition [4] and
provide suitable explanations [10].

Need for Uniform Framework

Although the last decade has seen a clear consensus in favor of
task-level analyses and the advantages they offer for knowledge
modeling and acquisition, confusion remains regarding the following:

1. Distinctions between tasks and methods and how complex generic
tasks and simple generic tasks are related,

2. The great variety of knowledge-modeling terms that have been
proposed without any simple way to map between them,

3. Avoiding overdetermination and rigidity in the ways various tasks
are performed in the various proposals. That is, we need to show how
these various task-level ideas, at various grain sizes, can be
combined flexibly.

To overcome these problems, we develop the notions of a task, method,
subtask, and the concept of a task structure. The task structure is a
uniform task-level analysis framework for describing systems. By
viewing the various task-level approaches in terms of the task
structure we can begin to compare the approaches and also unravel the
current confusions.

The Task Structure

The Task Structure is the tree of tasks, methods and subtasks applied
recursively until tasks are reached that are in some sense performed
directly using available knowledge. Figure 2 graphically represents
part of the task structure for diagnosis. A task (as we defined
earlier) is a problem type, such as diagnosis. Tasks are represented
graphically using circles. A method is a way of accomplishing a task.
These are represented graphically using rectangles. In the figure,
Bayesian Explanation, Abductive Assembly and Cover-and-differentiate
are identified as methods for diagnosing. All of these methods can be
classified as abductive methods;¹ hence they appear as a subtype of
Abduction Methods. In general, a task can be accomplished using any
one of several alternative methods; thus in the task structure we can
explicitly identify alternative methods for each task. A method can
set up subtasks, which themselves can be accomplished by various
methods. For example, in the Diagnosis task structure Abductive
Assembly has been decomposed into two subtasks: Generate Plausible
Hypotheses and Select Hypotheses.

¹Abduction is the problem of reasoning from effect to cause.
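The tree of tasks, methods and subtasks described above can be
sketched as a small data structure. This is only an illustrative
sketch: the class names `Task` and `Method` and the helper `show` are
assumptions made for the example, and only the part of the Diagnosis
structure named in the text is encoded.

```python
from dataclasses import dataclass, field


@dataclass
class Method:
    """A way of accomplishing a task; it may set up subtasks."""
    name: str
    subtasks: list["Task"] = field(default_factory=list)


@dataclass
class Task:
    """A problem type, with the alternative methods that can accomplish it."""
    name: str
    methods: list[Method] = field(default_factory=list)


def show(task: Task, depth: int = 0) -> list[str]:
    """Return an indented outline: tasks as (name), methods as [name]."""
    lines = ["  " * depth + f"({task.name})"]
    for method in task.methods:
        lines.append("  " * (depth + 1) + f"[{method.name}]")
        for sub in method.subtasks:
            lines.extend(show(sub, depth + 2))
    return lines


# Part of the Diagnosis task structure, as named in the text.
diagnosis = Task("Diagnosis", [
    Method("Bayesian Explanation"),
    Method("Abductive Assembly", [
        Task("Generate Plausible Hypotheses"),
        Task("Select Hypotheses"),
    ]),
    Method("Cover-and-differentiate"),
])

print("\n".join(show(diagnosis)))
```

Keeping the alternative methods as a list on each task is what lets a
system choose among them later, instead of fixing one method in the
tree.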

Knowledge  in  a  task  structure
comes  in  four  forms.  First,  each 
task  must  be  accomplished  using 
knowledge  that  maps  the  input  of 
the  task  to  the  output.  Second, 
knowledge  must  indicate  when  an 
applicable  subtask  is  needed. 
Third,  when  a  method  consists  of 
subtasks,  knowledge  is  needed  to 
sequence  the  subtasks.  Fourth, 
when  a  task  can  be  accomplished 
using  two  or  more  methods,  knowl¬ 
edge  is  needed  to  select  a  method. 

The following subsections describe the task structure in detail. The
section “Examples of Task Structure for Design and Diagnosis”
discusses the task structures for design and diagnosis in detail.

Figure 2. Part of the task structure for Diagnosis. Circles represent
tasks; rectangles represent methods. See section “Examples of Task
Structure for Design and Diagnosis” for a discussion of the role of
simulation.

Tasks 

Tasks are specified as transforming an initial problem state with
certain features to a goal state with certain additional features.
For example, in the diagnostic task the initial state includes
malfunction observations and the goal state includes information
about the causes of the malfunctions. It is important to distinguish
between a task and a task instance. A task instance is a particular
problem/goal state pair, such as the diagnosis of a particular
patient with specific symptoms. In contrast, a task specifies a
family of task instances of a certain type. This family can be
defined at various levels of generality. For example, the diagnosis
of a patient with specific symptoms is a task instance of the medical
diagnosis task, which is itself a subclass of the general diagnosis
task.


Methods  and  Subtasks 

Methods are ways of accomplishing tasks and may be of many types:
they may be computational, or “situated” (i.e., involve extracting
information from the surrounding physical world). For example, the
task of predicting behavior of a device may be solved by a
computational method that performs a simulation, or it may be solved
by manipulating a physical model of the device and seeing what
happens. Within the class of computational methods, a method may be
couched as executing a precompiled algorithm, as search in a state
space, as a connectionist network and so on.

Our uniform framework for describing methods is based on the
problem-space computational model² (PSCM) [28] and was adopted as a
result of work on TIPS [31], an architecture for dynamically
integrating generic tasks, and work done integrating generic tasks
using problem spaces in the Soar architecture [18]. We define a
method to be a set of subtasks that can be used to transform the
initial state of a task to the goal state. The method may contain
additional information about ordering the subtasks, called
search-control knowledge [21], or this knowledge can be generated at
run time.

While the task structure allows the specification of methods of
different types, those that are modeled as problem-space search have
a special role for three reasons. First, one way to understand
knowledge systems as a distinct type of information technology is to
note that the role of explicit knowledge in them is to set up
alternatives, and to evaluate and refine them. In MYCIN, for example,
the knowledge in its knowledge base enables it to set up and evaluate
various bacterial infection hypotheses. Second, the architecture that
integrates the different methods can itself be viewed as operating in
a search space of methods and making selections in it. Third, we will
see that the notion of subtasks emerges naturally in the framework of
search in problem spaces.

²For an example of a direct application of the problem-space
computational model see [30] in this issue.

To clarify this, let us consider how to represent the
Establish-Refine method for hierarchical classification (see earlier
subsection “Background Work in Knowledge Modeling”) using the
framework described. Hierarchical classification is used in many
diagnosis systems as a way of quickly focusing on possible
malfunctions. The initial state of the classification task is a set
of data (e.g., manifestations in a diagnosis task) and an initial
high-level hypothesis (e.g., liver disease). The goal state is one
containing plausible malfunction hypotheses (i.e., the most detailed
hypotheses consistent with the data). The method works by first
considering a high-level malfunction category, such as liver disease,
to determine if the malfunction appears likely given the data at
hand. If it appears likely, then the malfunction is refined to more
specific diseases, hepatitis and cancer, for example. The more
specific malfunctions are then evaluated against the data and any
that appear likely are refined. This process continues until no more
malfunctions can be refined.

We  can  specify  this  method  using 
two  subtasks: 

evaluate  hypothesis
refine  hypothesis 

The first, evaluate, takes some hypothesis (such as a malfunction
hypothesis) and assigns a likelihood based on the current case data.
A precondition for applying evaluate to a hypothesis is that the
hypothesis must not have already been evaluated. The second subtask,
refine, takes a hypothesis as input and produces the refinements for
that hypothesis. Refine has two preconditions: the hypothesis must be
likely and must not have already been refined.

We must also specify when an operation should be considered for
application to a state. For the hierarchical classification method we
are describing, the operations evaluate and refine should be
considered whenever their preconditions are met.

The initial and goal states and the subtasks define a search space or
problem space. Figure 3 illustrates the search space that results
when the method described is applied to a liver diagnosis problem.
The search space is the set of states reachable from the initial
state by applying the operators for the method. The figure shows part
of the search space of the task, beginning with the initial state,
S1, containing manifestations indicative of a viral infection
(labeled Data) and the high-level hypothesis liver disease. The only
operator applicable to this state is evaluate liver disease.
Application of this operation results in a new state, S2, in which
liver disease is rated likely (in the figure this is noted by setting
the hypothesis in boldface). Only one operator is applicable to S2,
refine liver disease, resulting in S3, which contains the refinements
of liver disease: cancer and infection. At S3 two operators are
applicable: evaluate cancer and evaluate infection; hence the tree
branches to show both possibilities: evaluate cancer results in S4',
in which cancer is determined to be unlikely, and evaluate infection
results in S4'', in which infection is rated likely.

Figure 3. Part of the search space for the hierarchical
classification method as applied to liver disease. The data for this
example are indicative of a viral infection. Hypotheses in plain text
are unevaluated; those in bold are likely; those shown in
strike-through are unlikely.

In the PSCM framework, problems are solved by searching through a
problem space for a path from the initial state to the goal state.
Problem-space search is done by enumerating subtasks applicable to
the current state (which at the start of problem solving is the
initial state of the task instance), selecting from these a single
subtask and then applying that operation to the current state. The
resulting state then becomes the new current state and the whole
process of operation selection and application is repeated until the
goal state is reached.

Search-control knowledge guides the search through the problem space.
For example, in hierarchical classification the agent might apply a
heuristic that it is better to evaluate hypotheses with higher
likelihoods than those with low likelihoods, or it might decide that
the decision about which evaluate operator to apply is not important,
hence either operator can be selected. In the task structure, we
specify the minimum amount of search knowledge needed for each
method. No search-control knowledge is specified for the hierarchical
classification method because any such knowledge would unduly
constrain the method. For example, if either of the heuristics
mentioned previously were included in the search-control knowledge
for the method, it would limit the application of the method to those
domains and task instances in which the heuristics apply.
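The evaluate and refine subtasks, their preconditions, and the
propose-select-apply cycle can be sketched as follows. This is a
minimal sketch under stated assumptions, not the authors'
implementation: the tables `REFINEMENTS` and
`LIKELY_GIVEN_VIRAL_DATA` are hypothetical stand-ins for domain
knowledge, the state layout is invented for the example, and the
`select` argument plays the role of search-control knowledge (by
default it simply takes the first applicable operator, so the method
itself stays unconstrained).

```python
# Hypothetical domain knowledge for the liver example (illustration only).
REFINEMENTS = {"liver disease": ["cancer", "infection"]}
LIKELY_GIVEN_VIRAL_DATA = {"liver disease": True, "cancer": False, "infection": True}


def applicable_operators(state):
    """Propose every operator whose preconditions hold in `state`.

    `state` maps each hypothesis to a dict with keys "likely"
    (None until evaluated) and "refined".
    """
    ops = []
    for hyp, info in state.items():
        if info["likely"] is None:                     # not yet evaluated
            ops.append(("evaluate", hyp))
        elif info["likely"] and not info["refined"]:   # likely, not yet refined
            ops.append(("refine", hyp))
    return ops


def apply_operator(state, op):
    """Return the successor state produced by applying `op`."""
    kind, hyp = op
    new_state = {h: dict(info) for h, info in state.items()}
    if kind == "evaluate":
        new_state[hyp]["likely"] = LIKELY_GIVEN_VIRAL_DATA[hyp]
    else:  # refine
        new_state[hyp]["refined"] = True
        for child in REFINEMENTS.get(hyp, []):
            new_state[child] = {"likely": None, "refined": False}
    return new_state


def search(initial_state, select=lambda ops: ops[0]):
    """Apply operators until none is applicable; `select` is the search control."""
    state, trace = initial_state, []
    while True:
        ops = applicable_operators(state)
        if not ops:
            return state, trace
        op = select(ops)
        trace.append(op)
        state = apply_operator(state, op)


s1 = {"liver disease": {"likely": None, "refined": False}}
final, trace = search(s1)
```

Passing a different `select` function, for example
`lambda ops: ops[-1]`, changes the order in which the two evaluate
operators are applied at S3 without touching the operators
themselves.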

The independent specification of search-control knowledge and
subtasks leads to two of the primary advantages of the problem-space
approach to specifying methods:

1. In the task structure we are not forced to specify a particular
subtask sequence. We can specify the search-control knowledge that is
general to all task instances for a method and defer other decisions
about subtask sequencing to system designers or run-time computation.
By doing this, we ensure that the method can be applied to as wide a
range of task instances as possible. In contrast, early GT work often
overconstrained the sequencing of subtasks, limiting the use of each
method to a narrow range of problems.

2. Search-control knowledge ensures a dynamic or situated system.
Each bit of search-control knowledge is sensitive to the subtasks and
the current state, hence the precise sequence of subtasks is
determined dynamically at run time.

Method Selection Knowledge
Functionally, there are four types of knowledge in a task structure.
We discussed three of these (search control, subtask application and
subtask proposal knowledge) in the previous subsection, since they
are related to the description of a method. The remaining type,
method selection knowledge, is associated with a task, or a
task/method combination. For example, the task structure for
Diagnosis includes multiple methods for evaluating a hypothesis. When
a system has two or more of these methods, method selection knowledge
must be present to determine the best method to use for the task
instance being solved.
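One simple form that method selection knowledge could take is a check
of which methods have the knowledge they need. The sketch below is an
assumption for illustration only: the function `select_method`, the
preference order, and the selection criterion are not given in the
article, and the knowledge requirements are paraphrased from its
descriptions of two diagnosis methods.

```python
# Knowledge each method needs, paraphrased from the article's descriptions.
REQUIRED_KNOWLEDGE = {
    "Bayesian": {"prior probabilities", "conditional probabilities"},
    "Abductive Assembly": {"disorder-manifestation associations"},
}


def select_method(available, preference=("Bayesian", "Abductive Assembly")):
    """Pick the first preferred method whose required knowledge is available.

    `available` is the set of knowledge types the system has for this
    task instance. Returns None when no method is applicable.
    """
    for name in preference:
        if REQUIRED_KNOWLEDGE[name] <= set(available):
            return name
    return None
```

A richer selector could also weigh cost or expected accuracy; the
point here is only that the choice is made per task instance rather
than fixed in advance.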

Direct  vs.  Derived  Knowledge 
The four types of knowledge in a task structure can be available in
two forms: it can be directly available for the task or it can be
computed by another method. Directly available knowledge is in a form
that maps the input of the task to the output. For example, directly
available knowledge for refine is of the form:

If task is refine hypothesis then refinements are r1, r2, r3 ...

For instance:

If task is refine liver disease then refinements are infection and
cancer.

No complex computation is required to use this knowledge to
accomplish the refine task; the knowledge is in a form directly
applicable to the task. If knowledge is not directly available, it
must be derived from existing knowledge or acquired from the external
environment. In either case, a method must be used to acquire
knowledge of the desired form. For example, refine can be
accomplished using a method that knows about different refinement
dimensions, such as refinements along etiologic and subpart
relations. This method can evaluate various dimensions and then
select the dimension appropriate for the task instance. For example,
liver disease could be refined using etiology to infection and cancer
or using anatomic structure to central-area and portal-area. The most
appropriate dimension to use depends on the task instance (i.e., the
kinds of manifestations available).

Whenever any of the four types of knowledge is not directly
available, subtasks to acquire the knowledge can be created and set
up as new problems. These subtasks are viewed like any other task:
they have an initial state and a goal state and can be accomplished
by the application of a method consisting of a set of operations.
Hence, although a method requires certain kinds of knowledge to be
applied to a task, this knowledge does not have to be known before
problem solving can begin, but can be dynamically acquired or derived
at run time.
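The difference between directly available and derived knowledge for
refine can be sketched as a lookup with a fallback subtask. This is a
hedged illustration: the function names and the rule inside
`choose_dimension` are invented for the example; only the liver
disease refinements and the two dimensions come from the text.

```python
# Directly available (compiled) knowledge: task input -> output.
DIRECT_REFINEMENTS = {"liver disease": ["infection", "cancer"]}

# Deeper knowledge: refinements organized along named dimensions.
REFINEMENT_DIMENSIONS = {
    "liver disease": {
        "etiology": ["infection", "cancer"],
        "anatomic structure": ["central-area", "portal-area"],
    },
}


def choose_dimension(hypothesis, manifestations):
    """Subtask: pick a refinement dimension for this task instance.

    The rule used here is invented purely for illustration.
    """
    if any("location" in m for m in manifestations):
        return "anatomic structure"
    return "etiology"


def refine(hypothesis, manifestations, direct=DIRECT_REFINEMENTS):
    """Use direct knowledge when present; otherwise derive it via a subtask."""
    if hypothesis in direct:
        return direct[hypothesis]
    dimension = choose_dimension(hypothesis, manifestations)
    return REFINEMENT_DIMENSIONS[hypothesis][dimension]
```

Calling `refine("liver disease", ["fever"])` uses the compiled table,
while passing `direct={}` forces the derivation path and lets the
manifestations decide the dimension.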

This idea is also closely related to the distinction between “deep”
and “shallow” knowledge, sometimes called “deep” and “compiled”
knowledge. There is also often another distinction between
model-based and rule-based reasoning, models being more general
knowledge describing the principles of the domain, while rules refer
to relatively ad hoc associations between evidence and hypotheses. In
[8], we provide an analysis of these terms and develop a notion of
“depth” of knowledge that is important for knowledge modeling. We
give a brief description of this idea.

Let K(T,M) denote the knowledge needed by method M in performing the
task T. If a knowledge system performing T using M has the knowledge
K(T,M) directly available in its knowledge base, let us say the
knowledge system has the knowledge in a compiled form. However,
suppose some knowledge element k in K(T,M) is missing in the
knowledge base, and the task of generating this knowledge is set up
as a subtask. If there exists some other body of knowledge in the
knowledge base, say K', so that by additional problem solving using
K' we can generate the knowledge element k, we can say that K' is
deep relative to k.

In the refine example we saw that anatomic structure is one of the
dimensions along which refinement could be done. So-called
model-based reasoning is an approach in which structural descriptions
of the device under diagnosis are used to generate refinement
hypotheses. From this device model, we can generate a list of
malfunctions (e.g., one malfunction category can be assigned to the
failure of each of the functions of each component; moreover,
malfunction categories can correspond to errors in connections
between components). The same structural model can be used to
generate knowledge needed for the evaluation subtask in Figure 2. The
structural model can be simulated for each malfunction, and
information about the relation between malfunctions and observations,
which is the type of knowledge needed for the methods of the
evaluation subtask, can be generated (see the next section for
information on simulation). Thus the structural model is a deep model
for the methods of classification and hypothesis evaluation that are
generally used in the diagnostic task.

The approach to defining the notion of depth of knowledge in the
framework of the task/methods/knowledge triple generalizes the
intuitive notion that has equated structural models with deep models.
Under our definition depth is a relative notion (i.e., it is relative
to a method for a task), and there is no notion of characterizing
knowledge as deep or shallow in some absolute way.

Examples  of  Task  Structure
for  Design  and  Diagnosis 

The  specification  of  a  task  structure 
consists  of  three  parts: 

1. an input-output relation that denotes the task;

2. the identification of methods and their subtasks (as in Figure 2);
and

3. knowledge to propose subtasks, implement subtasks, sequence
subtasks (search-control knowledge) and select methods.

The task-structure diagrams do not list the kinds of knowledge or the
input-output relations of the tasks; this is, however, an important
part of the specification of the task structure. The following
descriptions of design and diagnosis illustrate the main points about
specifying the task structure.

Part of the task structure for design is shown in Figure 4. In the
task structure diagrams, circles represent tasks and rectangles
represent methods. The top task for the design task structure is, of
course, design. The design task can be solved using a family of
methods called propose-critique-modify (PCM) [7]. These methods have
the subtasks of proposing partial or complete design solutions,
critiquing the proposals by identifying causes of failure, if any,
and modifying proposals to satisfy design goals; hence the three
subtasks shown for PCM: propose, critique and modify. These subtasks
can be combined in fairly complex ways, but the following method is
one straightforward way in which a PCM method can organize and
combine the subtasks.

Step  1.  Given  design  goal,  propose 
solution.  If  no  proposal,  exit  with 
failure. 

Step  2.  Verify  proposal.  If  verified, 
exit  with  success. 

Step  3.  If  unsuccessful,  critique
proposal  to  identify  sources  of  failure.
If  no  useful  criticism  available,
exit  with  failure.

Step  4.  Modify  proposal;  return  to  2. 
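The four steps above can be sketched as a generic loop. This is a
minimal sketch, not a definitive PCM implementation: the function
name, the callback signatures and the `max_rounds` guard against
endless modification are all assumptions added for the example, and
the toy design problem (reach a target number) is invented.

```python
def propose_critique_modify(goal, propose, verify, critique, modify, max_rounds=100):
    """Run the four-step PCM loop; return (status, proposal)."""
    proposal = propose(goal)                      # Step 1
    if proposal is None:
        return "failure", None
    for _ in range(max_rounds):                   # guard is an added assumption
        if verify(goal, proposal):                # Step 2
            return "success", proposal
        criticism = critique(goal, proposal)      # Step 3
        if criticism is None:
            return "failure", proposal
        proposal = modify(proposal, criticism)    # Step 4, then back to Step 2
    return "failure", proposal


# Toy usage: the "design" is a number that must reach the goal value.
status, design = propose_critique_modify(
    goal=10,
    propose=lambda goal: 4,
    verify=lambda goal, p: p >= goal,
    critique=lambda goal, p: goal - p,            # shortfall as the cause of failure
    modify=lambda p, shortfall: p + shortfall,
)
```

Because each subtask is passed in as a function, any of them can be
replaced by a different method without changing the loop.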

Figure 4. Part of the task structure for design [7]. Circles
represent tasks; rectangles represent methods.

There can be numerous variants on the way the methods in this class
work. For example, a solution can be proposed for only a part of the
design problem, a part deemed to be crucial. This solution can then
be critiqued and modified. This partial solution can generate
additional constraints, leading to further design commitments. Thus,
subtasks can be scheduled in a fairly complex way, with subgoals from
different methods alternating. One could generate all such variations
and identify them all as distinct methods, but both the need for
descriptive parsimony and the sheer number of methods that would
result argue against doing that.

Each of the PCM subtasks can be achieved using various methods. Three
such families of methods are shown for the proposal task (see
Figure 4): decomposition, case-based and constraint satisfaction. In
decomposition methods, domain knowledge is used to map subsets of
design specifications into a set of smaller design problems. The use
of design plans is a special case of the decomposition method.
Case-based methods are those retrieving from memory cases with
solutions to design problems similar or close to the current problem.
Constraint-satisfaction methods use a variety of quantitative and
qualitative optimization techniques.

Part of the task structure for diagnosis is shown in Figure 2. The
diagnosis task can be viewed as an abductive task, the construction
of a best explanation (one or more disorders) to explain a set of
data (manifestations). The task structure shows three typical
subclasses of abductive methods: Bayesian, abductive assembly [19]
and parsimonious covering [30]. Bayesian methods require knowledge of
prior probabilities of disorders and conditional probabilities
between disorders and manifestations. They use this knowledge to
estimate posterior probabilities of disorders. Abductive assembly
requires knowledge of disorders and the manifestations they explain.
This method works by first generating plausible hypotheses to explain
parts of the data and then using these hypotheses to assemble a
complete explanation of the data. Parsimonious covering works by
stepping through each manifestation, updating the current set of
parsimonious explanations as each manifestation is considered. Two
subtasks for abductive assembly are shown in the diagram,
generate-plausible-hypotheses and select-hypotheses. These tasks can
be done using many kinds of methods. Since Bayesian and
classification methods have typically been used to generate plausible
hypotheses, these are shown in the task structure.
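As a concrete reading of how a Bayesian method could turn priors and
conditional probabilities into posteriors, consider the sketch below.
It rests on two strong assumptions that the article does not state:
exactly one disorder is present, and manifestations are conditionally
independent given the disorder. The function name and all numbers are
invented for illustration.

```python
def posteriors(priors, likelihoods, manifestations):
    """Posterior P(disorder | manifestations) by Bayes' rule.

    Assumes exactly one disorder is present and that manifestations
    are conditionally independent given the disorder.
    `likelihoods[d][m]` is P(m | d).
    """
    scores = {}
    for disorder, prior in priors.items():
        score = prior
        for m in manifestations:
            score *= likelihoods[disorder][m]
        scores[disorder] = score
    total = sum(scores.values())          # normalizing constant P(manifestations)
    return {d: s / total for d, s in scores.items()}


# Invented numbers, for illustration only.
priors = {"infection": 0.5, "cancer": 0.5}
likelihoods = {"infection": {"fever": 0.8}, "cancer": {"fever": 0.2}}
result = posteriors(priors, likelihoods, ["fever"])
```

With these numbers the observed fever shifts most of the probability
to infection; the sketch would need a guard for the case where every
score is zero.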

The  task  structure  for  diagnosis 
also  shows  that  simulation  can  be 
used  to  implement  many  subtasks. 
By  simulation  we  mean  structure- 
to-behavior  simulation,  that  is,  de¬ 
termining  how  some  device  will 
behave  under  changes  to  its  struc¬ 
ture  by  simulating  its  behavior 
under  those  conditions.  We  have 
earlier  discussed  the  role  of  simula¬ 
tion  for  accomplishing  subtasks  (see 
previous  subsection  "Direct  vs.  De¬ 
rived  Knowledge"). 

The knowledge required to use abductive assembly consists of control
knowledge for sequencing subtasks as well as the knowledge required
to accomplish the subtasks. Control knowledge is specific to an
application or domain, but the knowledge for accomplishing subtasks
can be defined using the input/output specifications of the subtasks.
Generate-plausible-hypotheses takes as input one or more
manifestations and outputs one or more disorders that could be used
to explain those manifestations. Select-hypothesis takes as input the
set of manifestations, the set of disorders currently being used to
explain manifestations, and the set of disorders that could be used
to explain one or more additional manifestations (i.e., the output of
generate-plausible-hypotheses). The output of select-hypothesis is
the disorder it has determined to use to explain the manifestations.
Hence, by describing the input/output of the subtasks of abductive
assembly we also specify the knowledge required to use the method.
Simulation can be used to evaluate a hypothesis because the
simulation can reveal whether the hypothesis is possible given the
data about the device. Causal refinements of a category can be
determined by simulating to determine the possible outcomes of a set
of inputs to a device. Simulation plays an important role in many
task structures because it is a fairly general method for generating
knowledge based on the structure of a device. We did not show the
simulation method in the design task structure, but there too it can
play an important role, especially for critiquing designs.
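The input/output specifications of the two abductive assembly
subtasks can be sketched as functions. This is a simplified
illustration under stated assumptions: the greedy coverage score in
`select_hypothesis` is a stand-in chosen for the example, not the
article's selection criterion, and the subtask here receives the
still-unexplained manifestations rather than the three separate
inputs listed in the text. The table `explains` and the identifiers
d1, d2, m1, ... are hypothetical.

```python
def generate_plausible_hypotheses(manifestations, explains):
    """Input: manifestations. Output: disorders explaining at least one of them."""
    return {d for d, covered in explains.items() if covered & set(manifestations)}


def select_hypothesis(unexplained, candidates, explains):
    """Pick the candidate covering the most still-unexplained manifestations.

    This greedy coverage score is an assumption made for the sketch.
    """
    return max(sorted(candidates), key=lambda d: len(explains[d] & unexplained))


def abductive_assembly(manifestations, explains):
    """Assemble a set of disorders that together explain the manifestations."""
    unexplained, explanation = set(manifestations), []
    while unexplained:
        candidates = generate_plausible_hypotheses(unexplained, explains)
        if not candidates:
            break                       # some data cannot be explained
        chosen = select_hypothesis(unexplained, candidates, explains)
        explanation.append(chosen)
        unexplained -= explains[chosen]
    return explanation, unexplained


# Hypothetical knowledge: which manifestations each disorder explains.
explains = {"d1": {"m1", "m2"}, "d2": {"m2"}, "d3": {"m3"}}
explanation, leftover = abductive_assembly(["m1", "m2", "m3"], explains)
```

The control knowledge (here, the simple loop) stays separate from the
two subtasks, so either subtask could be replaced by a Bayesian or
classification method without changing the assembly loop.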

These task structures are based on the methods and subtasks implicit
in many expert systems that perform the tasks. Neither of the task
structures is meant to be complete; both, however, capture a wide
range of the methods useful for achieving the respective tasks. As we
discover additional methods, these can be added to the structure.
Some methods (such as depth-first search) are so general they can be
used to solve any problem. These methods are not listed in the task
structure since they would appear everywhere, cluttering the diagram.

The task structure is meant to be an analytical tool. We do not mean
to imply that the implementation of a system must have a one-to-one
correspondence to the task structure, but that a system that performs
diagnosis or design can be viewed as using some of the methods and
subtasks. In particular, the task structure does not fix the order of
subtasks or dictate that a single method must be used to achieve each
task. It is also not meant to correspond to a procedure-call
hierarchy, although that is one way to directly implement a task
structure. The task structure simply provides a vocabulary to use in
describing how systems work. The systems being described might be
based on neural networks, production rules, frames, or a
task-specific language; this is unimportant for the use and
construction of the task structure.

To further emphasize the importance and role of the task structure for
describing systems, let us consider three descriptions of RedSoar, a
complex abductive system that interprets immunohematologic tests in
order to identify antibodies present in a patient's blood [17]. In
describing a complex knowledge system such as RedSoar, we can use three
levels: 1) the task structure; 2) a computational level (such as problem
spaces); and 3) a symbol level. Figure 3 shows each of these levels for
RedSoar. RedSoar uses abductive assembly of antibody hypotheses to
construct a best explanation of the test data. In the task, the test
data are the manifestations; antibodies are the “disorders” or
explanations. RedSoar is directly implemented in Soar's production-rule
language and can be described by listing all the rules in the knowledge
base, such as those in Figure 3c. About 1,000 of these rules constitute
the symbol-level view of RedSoar. However, this description fails to
capture the task-level control and knowledge in the system. To do this,
RedSoar can be described at a computational level by listing the problem
spaces defined by the Soar production rules, as in Figure 3b. That is,
we can abstract away from the symbol-level production rules to focus on
the problem spaces, their initial and desired states, and their
operators. This level of description is much closer to the task level,
but would still contain too many details present as artifacts of the
implementation (e.g., extra operators that must be used for low-level
manipulation of representations). At the task-structure level (see
Figure 3a), we can simply describe the system as using abductive
assembly and then point out how it generates and selects hypotheses,
i.e., the methods and knowledge that it uses. RedSoar uses conditional
and a priori probabilities to generate plausible hypotheses and a
scoring function based on explanatory coverage and plausibility ratings
to select a hypothesis. As shown in the figure, RedSoar also uses two
additional subtasks, rule-out and confirm hypotheses. These are
domain-specific subtasks. The first allows the system to quickly rule
out clearly absent antibodies. The second lets the system focus on
antibodies that are likely present. By describing RedSoar at this level,
a comparison can be made between it and other abductive assembly systems
by comparing the methods and knowledge used to generate and select
hypotheses.
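
The task-level description above (rule out, generate, then select by
coverage and plausibility) can be illustrated with a minimal sketch. It
is not RedSoar's implementation: the greedy loop, the form of the
scoring function, and the names Hypothesis, rule_out, score, and
assemble are all assumptions made for illustration, as are the antibody
labels used in the test data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    name: str
    explains: frozenset   # findings this hypothesis can account for
    plausibility: float   # hypothetical rating in [0, 1]


def rule_out(hypotheses, ruled_out_names):
    """Rule-out subtask: drop hypotheses already known to be absent."""
    return [h for h in hypotheses if h.name not in ruled_out_names]


def score(h, unexplained):
    """Assumed scoring form: coverage of unexplained findings times plausibility."""
    return len(h.explains & unexplained) * h.plausibility


def assemble(findings, hypotheses):
    """Greedy abductive assembly: keep adding the best-scoring hypothesis
    until every finding is explained or no candidate helps."""
    unexplained = set(findings)
    candidates = list(hypotheses)
    composite = []
    while unexplained and candidates:
        best = max(candidates, key=lambda h: score(h, unexplained))
        if score(best, unexplained) == 0:
            break  # nothing left can explain the remaining findings
        composite.append(best.name)
        unexplained -= best.explains
        candidates.remove(best)
    return composite, unexplained
```

Two systems described this way can be compared simply by looking at
what each uses in place of score and rule_out, which is the kind of
comparison the task-structure level is meant to support.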


Figure 3. RedSoar described at three levels: a) the task structure;
b) the problem-space level (a computational level); and c) production
rules (the symbol level).

Knowledge Modeling and the
Task Structure

The task structure described in the previous section facilitates
knowledge modeling in several ways. First, it associates tasks with
methods that accomplish them and the knowledge required to use the
methods. The multiple levels of the task structure show how knowledge
can be decomposed into bodies of knowledge that are associated with
specific tasks. The task structure also highlights the generality and
specificity of the knowledge needed for a problem-solving method. That
is, it allows methods to be compared based on the required knowledge.
Hence, we can see how some methods require little domain knowledge (such
as depth-first search, which only requires knowledge to recognize a goal
state), while others require considerable domain knowledge (such as
hierarchical classification, which needs a domain-specific hierarchy of
categories).
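
The contrast in knowledge requirements can be made concrete with a small
sketch. Both functions are illustrative assumptions rather than code
from any of the systems cited: depth_first_search is handed only a
generic successor function and a goal test, while classify cannot run
without a domain-specific category hierarchy and a test for whether a
category is established. The establish-then-refine traversal used here
is one simple strategy chosen for the example.

```python
def depth_first_search(state, successors, is_goal, seen=None):
    """Weak method: the only domain knowledge it needs is the goal test."""
    seen = set() if seen is None else seen
    if is_goal(state):
        return [state]
    seen.add(state)
    for nxt in successors(state):
        if nxt not in seen:
            path = depth_first_search(nxt, successors, is_goal, seen)
            if path:
                return [state] + path
    return None


def classify(node, hierarchy, established):
    """Knowledge-rich method: descend a category hierarchy, refining only
    below categories that the data establish."""
    if not established(node):
        return []
    children = hierarchy.get(node, [])
    refined = [leaf for c in children
               for leaf in classify(c, hierarchy, established)]
    # If no child category is established, this node is the most
    # specific classification available.
    return refined or [node]
```

The hierarchy argument is where the domain knowledge enters: without it
the second method has nothing to traverse.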

Second, since methods are characterized by the knowledge they require,
domains can be modeled by tools appropriate for the knowledge that is
available in the domain. High-level tools based on this concept, such as
CSRL [3], DSPL [2], MUM [16], and MOLE [13], illustrate how this
approach facilitates knowledge modeling, knowledge acquisition,
explanation, and learning.

Third, the task structure view should be contrasted with what one might
call a “uniform normative algorithm” view of how to solve complex
problems such as diagnosis or design. For example, there have been
proposals for a general algorithm for diagnosis: “diagnosis from first
principles” [32] and Bayesian networks [29] are two examples. The
general algorithms, while guaranteeing an optimal solution within their
respective frameworks, are typically intractable. In these cases the
engineering of systems to solve the tasks is done by various forms of
heuristic approximations, which of course no longer have the normative
properties associated with the original algorithm. The general
algorithms also do not always make contact with the form in which
knowledge is actually available in various real-world domains. Thus, the
Bayesian framework may be fine for a domain in which the needed prior
and conditional probabilities (or good approximations to them) are
available, but in other domains in which the domain knowledge takes
other forms, there is often a need for translating from these forms to
the probabilistic forms in which knowledge is needed.

The task structure view, on the other hand, views the solution of
complex problems as arising from the interaction of many local methods
for local tasks. In any domain having a record of successful human
problem-solving, the knowledge in the domain helps to decompose the task
into manageable chunks, so each of the problems can be solved to the
degree of precision and accuracy needed for the domain. It then becomes
the task of the AI theorist to develop vocabularies of generic tasks,
methods and knowledge. Thus the attention is shifted from the search for
uniform algorithms to modeling knowledge and methods by which tasks are
decomposed and subtasks are accomplished.

We can also see how such task structures evolve in real-world domains.
If classification is a generally effective method for the
generate-hypotheses subtask of diagnosis, then over time, the
problem-solving community develops the knowledge needed to apply it.
Thus the medical community has devoted hundreds of years to the
development of disease taxonomies, which is the form in which the
classification method needs knowledge. The knowledge compilation
techniques (see subsection “Deep vs. Derived Knowledge”) are also a
means by which knowledge in a less direct form is converted into
knowledge in a form that is more directly usable by a computationally
attractive method. Thus we see that in domains and tasks of importance,
the domain knowledge tends to evolve over time so that methods with good
computational properties can be supported.

The fact that we do not start with a uniform normative algorithm does
not mean we cannot be precise about the behavior of systems built in the
task-structure framework. Bylander [3] and Goel [14] are examples of
analyses in which the role of specific types of knowledge in producing
good computational properties can be studied within the general
framework of the task-structure view. For example, Goel et al. show why
classification is an attractive method, if knowledge in the form of
classification hierarchies is available, and Bylander et al. show how
knowledge about the existence of certain types of causal links (and
nonexistence of other types) makes the abductive assembly method
tractable.

Fourth, the task structure emphasizes that different kinds of methods
can be combined: quantitative and qualitative knowledge, heuristic and
algorithmic knowledge can be appropriately combined for the
accomplishment of a task. For example, if a subtask can be achieved
using a known technique, for instance by solving a set of differential
equations, that method can be used instead of more traditional AI
methods. Since the method that set up this subtask is concerned with the
solution, rather than how it was determined, the original task can be
implemented using a different kind of method or even a different
computational architecture.

Fifth, the generation of new knowledge can itself be viewed as a
reasoning task. Hence, during knowledge modeling, appropriate questions
can identify sources of deep knowledge for various methods in the task
structure.

Sixth, the task structure outlined can be used to understand the
different task-level knowledge-modeling schemes that have been proposed,
which were reviewed earlier (see subsection “Background Work in
Knowledge Modeling”). KADS, as well as Clancey's heuristic
classification, have identified extremely general terms or tasks. These
tasks can be used to describe almost any method. Hence they can be
considered a set of primitive knowledge-modeling terms. Generic task
terms are at a higher level of abstraction in the task structure. They
are not general enough to be used to describe all methods, but can be
used to describe how higher-level tasks such as diagnosis and design can
be performed. Generic tasks can, in turn, be described using more
primitive terms. This is in fact what has been done throughout the years
of research on generic tasks: the large-grained tasks have been
repeatedly decomposed into finer-grained tasks.

Finally, the task structure clears up the confusions discussed earlier
(see section “Need for Uniform Framework”). These can be addressed as
follows:

1. The relation between complex and more primitive generic tasks is, in
one sense, relative to the system being described. For example, if a
task is implemented by a method that spawns subtasks, we say the
subtasks are more primitive, while the higher task is more complex. The
concepts are relative because any task, even those lower in the task
structure, can potentially be implemented using a complex set of
subtasks. In another sense, we can identify tasks appearing as subtasks
in a large number of methods as being more primitive or general than
tasks appearing in fewer methods. Hence we can say that diagnosis is a
less general and more complex task than data abstraction.

2. The variety of knowledge-modeling terms that have been proposed is
due to researchers looking at different parts of task structures and
having different goals in mind. Some have looked for extremely primitive
terms (e.g., Clancey and KADS), while others have tried to identify
higher-level terms (e.g., Steels and Generic Tasks). The task structure
shows how these terms can be related through tasks, methods and
subtasks. There is still a problem with mapping between terms at the
same level; however, Clancey's system model-construction perspective
(i.e., the view that what KBSs really do is construct models of systems
they are reasoning about) provides a scheme to compare these terms by
representing each term in a uniform set/graph/operator language [12].

3. Overdetermination and rigidity in methods are avoided by using the
task structure because a complete method does not need to be specified
(only the subtasks are given and not all of these have to be used to
accomplish a task). Furthermore, multiple methods can be used to model
domains that do not warrant the selection of a single method for
accomplishing a task. Overdetermination and rigidity of implementation
can be avoided by dynamically determining methods and subtask sequencing
at run-time. Details of how this can be done in the context of generic
tasks are given in [18]; but the basic idea is to dynamically determine
what to do at each problem-solving step. That is, after each operation
is performed the situation is reassessed to determine what can be done
next. Knowledge is then brought to bear to select one of these
operations.
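
The run-time control idea in item 3 (reassess after each operation,
then use knowledge to select the next one) can be sketched as a short
loop. This is only an illustration under stated assumptions, not the
mechanism described in [18]: the encoding of the state as a frozenset of
facts, the operation dictionaries, the toy operations, and the
priority-based selection are all hypothetical.

```python
def solve(state, operations, choose, done, max_steps=100):
    """Run a reassess-then-select control loop over a set of operations.

    state      -- a frozenset of facts (an assumed, deliberately simple encoding)
    operations -- dicts with "name", "applicable"(state) and "apply"(state)
    choose     -- selection knowledge: picks one of the applicable operations
    done       -- goal test on the state
    """
    trace = []
    for _ in range(max_steps):
        if done(state):
            break
        # Reassess the situation: what can be done next?
        applicable = [op for op in operations if op["applicable"](state)]
        if not applicable:
            break  # impasse: nothing applies
        op = choose(state, applicable)
        state = op["apply"](state)
        trace.append(op["name"])
    return state, trace


def add_fact(fact):
    """Build an 'apply' function that adds one fact to the state."""
    return lambda state: state | {fact}


# Hypothetical operations for a toy abductive task, listed out of order
# on purpose: the order of use is decided at run time, not by the list.
OPERATIONS = [
    {"name": "select",
     "applicable": lambda s: "hyps" in s and "answer" not in s,
     "apply": add_fact("answer")},
    {"name": "generate",
     "applicable": lambda s: "hyps" not in s,
     "apply": add_fact("hyps")},
    {"name": "rule_out",
     "applicable": lambda s: "screened" not in s,
     "apply": add_fact("screened")},
]

PRIORITY = {"rule_out": 0, "generate": 1, "select": 2}


def choose_by_priority(state, applicable):
    """Stand-in for selection knowledge: prefer cheap screening first."""
    return min(applicable, key=lambda op: PRIORITY[op["name"]])
```

Swapping in a different choose function changes the sequencing without
touching the operations, which is the sense in which the method is not
overdetermined.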

Knowledge Modeling Is a Task-Specific Enterprise
There have been some attempts to develop a small number of basic terms
in which to formulate all the knowledge to be represented in KBSs. We
think it premature to discuss representing knowledge in general at this
point in our understanding. We can, however, for various types of tasks,
develop detailed theories of the methods and knowledge required to
implement them and the terms in which such knowledge can be represented.
Thus the knowledge-modeling methodology is a cumulative enterprise by
researchers around the world: as research in some task (say diagnosis or
design) is carried on in various domains around the world, different
methods are identified, their knowledge requirements understood,
generalizations and commonalities recognized, performance
characteristics of the methods are analytically understood, and a task
structure which incorporates this collective product of research
emerges. Knowledge modeling for that particular task is then facilitated
by this task structure: we know what kinds of knowledge and strategies
are needed for the methods and can use the terms of analysis to model
the knowledge in the domain.

In this sense, over the last several years significant knowledge has
been accumulated for the following tasks: diagnosis, hierarchical
design, configuration tasks and some classes of device simulation tasks.
We have outlined the task structure for some of these tasks, and shown
how the modeling of knowledge is facilitated by this framework.

The usefulness of the task analysis in the form proposed in this article
(and the resulting task structure) is not limited to automation of
problem-solving. The analysis itself is merely a description of how the
task might be decomposed and what kinds of knowledge are needed. It is
possible that for one reason or another some of the methods may not be
automatable: the needed knowledge may not be available in a
computer-processable form, or the method might itself not be
sufficiently operationalized. The task analysis still provides the tool
for decomposing a task and identifying which subparts of an overall task
can be automated. The subtasks for which automatable methods do not
exist at a given stage of AI theory-making can be simply directed to a
human problem solver with expertise in the subtask. Thus the task
structure provides the framework for natural human-machine cooperation.
As we understand how to operationalize a previously unautomated method
and we acquire the knowledge needed for it, that subtask can be given
over to the machine. The task structure thus provides a mobile boundary
between human and machine in problem-solving.
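
A very small sketch can illustrate the "mobile boundary" idea. It
assumes, purely for illustration, that the task analysis yields a list
of subtask names and that the set of subtasks with an automated method
is known; the function name assign_subtasks and the subtask labels in
the test are hypothetical.

```python
def assign_subtasks(subtasks, automated):
    """Send each subtask to the machine when an automated method is
    registered for it, and to a human problem solver otherwise."""
    return {t: ("machine" if t in automated else "human") for t in subtasks}
```

Adding a subtask to the automated set as its method becomes
operationalized moves the boundary without changing the decomposition
itself.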

Form and Content Issues in the Abductive Framework for Recognition^1

B. Chandrasekaran, John R. Josephson and Susan G. Josephson^2

Laboratory  for  AI  Research 
The  Ohio  State  University 
Columbus,  OH  43210,  USA 

Introduction 

In  an  earlier  paper,  Chandrasekaran  and  Goel  (1988)  considered  different 
approaches  to  the  task  of  classification.  The  classification  problem  is  one  of  mapping 
from  observations  (data)  to  pre-enumerated  classes.  As  the  classification  problem  grows 
complex,  the  solutions  evolve  from  simple  one  step  numerical  mapping  to  use  of 
intermediate  abstractions  (symbols)  to  rules  that  use  relations  between  abstractions  to 
complex  reasoning  with  knowledge,  summarized  in  the  progression: 

numbers --> abstractions (symbols) --> relations --> knowledge structures

We compared pattern classification approaches, connectionist networks, syntactic
methods and finally knowledge-based classification. The task of classification
particularly  was  deemed  important  since  it  is  so  ubiquitous,  playing  a  role  in  visual  and 
speech  recognition,  diagnosis  and  numerous  other  tasks  of  importance.  Over  the  last 
several  years,  however,  a  broader  framework  called  Abduction  has  been  gaining  attention 
for  many  of  the  same  problems  that  have  been  traditionally  cast  in  the  mold  of 
classification.  In  this  paper,  we  discuss  the  structure  of  the  abductive  task  and  relate  it  to 
the  problem  of  recognition  in  general.  We  will  note  that  classification  still  has  a  role  as  a 
component  in  abduction. 

Form  versus  content 

One  particular  idea  that  didn’t  seem  to  come  through  very  clearly  in 
Chandrasekaran  and  Goel  (1988)  is  the  issue  of  form  versus  content.  Often  much  of  the 
discussion  in  the  field  of  AI  and  pattern  recognition  seems  to  revolve  around  whether  this 
mechanism  or  that  (connectionist  nets,  or  logic  or  symbolic  approaches)  is  the  right  way 
to  go  about  building  a  system  to  solve  some  problem  (say  for  visual  recognition).  We 
have  argued  elsewhere  (Chandrasekaran,  Goel  and  Allemang,  1989)  that,  for  many 
purposes,  mechanism  questions  are  not  the  right  kind  of  questions  to  ask,  at  least  not  in 
the beginning. These questions should be deferred until an understanding of the content
of a task is obtained. When the task structure is properly understood, then decisions
about what kinds of mechanisms are appropriate for what subtasks can be made. Marr
(1982) of course has also made similar arguments, and Newell (1981) is also credited
with similar remarks in his discussion on the Knowledge Level/Symbol Level distinction.
In spite of these well-known analyses, there is still not a sufficient understanding of the
importance of content issues in most of the research in the field.


^1 For presentation at Image Processing: Theory and Applications, An International
Conference, Sanremo, Italy, June 16, 1993.

^2 Also with the Columbus College of Art and Design, Columbus, Ohio, USA


In this talk, we will discuss an abductive framework for recognition, and present
the discussion largely in terms of the content of the task: what kinds of information are
needed, what kinds of subtasks arise in the course of performing abduction, and so on. The
work  that  is  presented  here  is  elaborated  in  greater  detail  in  (Josephson  and  Josephson, 
1994). 


Abduction 

Abduction is a general framework for many important problems in cognition and
perception. Abduction has been used to frame the problems of diagnosis, scientific theory
formation, and natural language understanding, and (of particular relevance to the subject
of this conference) abduction is a more general framework than classification for visual
recognition.

Abduction, or inference to the best explanation, is a form of inference that goes
from data describing something to an explanatory hypothesis that best explains or
accounts for the data. Thus abduction is a kind of theory-forming or interpretive
inference. We take abduction to be a distinctive kind of inference following this pattern
pretty nearly^3:

D  is  a  collection  of  data  (facts,  observations,  givens). 

H (hypothesis) explains D (would, if true, explain D).

No  other  hypothesis  is  able  to  explain  D  as  well  as  H  does. 


Therefore,  H  is  probably  true. 

The  core  idea  is  that  a  body  of  data  provides  evidence  for  a  hypothesis  that 
satisfactorily  explains  or  accounts  for  that  data  (or  at  least  it  provides  evidence  if  the 
hypothesis  is  better  than  explanatory  alternatives). 
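
The inference pattern above can be read as a selection rule with a
built-in comparison against rivals. The sketch below is one possible
reading, not the authors' formulation: the goodness function, the margin
parameter, and the decision to return None when no hypothesis stands out
are assumptions introduced for illustration.

```python
def best_explanation(data, hypotheses, goodness, margin=0.1):
    """Return the hypothesis that explains `data` best, or None when the
    leading hypothesis is not clearly better than its closest rival.

    goodness(h, data) -- an assumed measure of how well h explains data
    margin            -- an assumed threshold for "clearly better"
    """
    ranked = sorted(hypotheses, key=lambda h: goodness(h, data), reverse=True)
    if not ranked:
        return None
    top = goodness(ranked[0], data)
    runner_up = goodness(ranked[1], data) if len(ranked) > 1 else 0.0
    return ranked[0] if top - runner_up >= margin else None
```

Returning None reflects the third premise of the pattern: if a rival
explains the data about as well, the data provide little evidence for
either hypothesis.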

We can see quite easily why visual or auditory recognition fits this pattern of
inference. The task is specified by a set of data (pixel intensities, or auditory signal
amplitudes at different times), and the goal is to come up with a hypothesis (a description
of the scene in terms of objects and their locations, or words that the speaker uttered as
he or she produced the sound pattern) that best corresponds to the cause that produced the
data. Of course, as a rule, different hypotheses could explain the same data, i.e., the same
pixel intensity distribution or temporal sound amplitude distribution could have been
caused by different objects or words, so the best that a perceiver can do is to come up with
the hypothesis that best accounts for the data.

Sometimes a distinction is made between an initial process of coming up with
explanatorily useful hypothesis alternatives and a subsequent process of critical
evaluation wherein a decision is made as to which explanation is best. Sometimes the
term “abduction” has been restricted to the hypothesis-generation phase. We use the
term here for the whole process of generation, criticism, and acceptance of explanatory
hypotheses. One reason is that, although the explanatory hypotheses in abduction can be
simple, more typically they are composite, multipart hypotheses. A scientific theory is


^3 This formulation is largely due to William Lycan.
typically a composite with many separate parts holding together in various ways^4, and so
is our understanding of a sentence, and our judgment of a law case, as well as our
interpretation of what we see. But no feasible information-processing strategy can afford
to explicitly consider all possible combinations of potentially usable theory parts, since
the  number  of  combinations  grows  exponentially  with  the  number  of  parts  available 
(Josephson  and  Josephson,  1994).  Reasonably-sized  problems  would  take  cosmological 
amounts  of  time.  So,  one  must  typically  adopt  a  strategy  which  avoids  generating  all 
possible  explainers.  Some  pre-screening  of  theory  fragments  to  remove  those  that  are 
implausible  under  the  circumstances  makes  it  possible  to  radically  restrict  the  potential 
combinations  that  can  be  generated,  and  thus  goes  a  long  way  towards  taming  the 
combinatorial  explosion.  But  such  a  strategy  breaks  the  clean  separation  between  the 
process  of  coming  up  with  explanatory  hypotheses,  and  the  process  of  acceptance, 
because  it  mixes  a  degree  of  critical  evaluation  into  the  process  of  hypothesis  generation. 
Thus,  computationally,  it  seems  best  not  to  neatly  separate  generation  and  acceptance. 

We  take  “abduction”  to  include  the  whole  process  of  generation,  criticism,  and  possible 
acceptance  of  explanatory  hypotheses. 
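
The size of the combinatorial problem, and the effect of pre-screening,
can be checked with a little arithmetic. The sketch below assumes that a
composite hypothesis is any nonempty subset of the available parts and
that each part carries a plausibility score; the threshold and the
function names are illustrative assumptions.

```python
def composite_count(n_parts):
    """Number of nonempty combinations of n_parts theory fragments."""
    return 2 ** n_parts - 1


def prescreen(parts, plausibility, threshold):
    """Keep only the fragments that are plausible under the circumstances."""
    return [p for p in parts if plausibility[p] >= threshold]
```

With 40 candidate parts there are 2^40 - 1 (about 1.1 x 10^12)
composites to consider; if screening leaves 8 parts, only 255 remain.
The screening step is itself a piece of critical evaluation, which is
why generation and acceptance are hard to separate cleanly.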

It  seems  to  us  that  perception  is  abduction  in  layers.  We  present  here  a  layered- 
abduction  computational  model  of  perception  which  unifies  bottom-up  and  top-down 
processing  in  a  single  logical  and  information-processing  framework.  In  this  model  the 
processes  of  interpretation  are  broken  down  into  discrete  layers,  where  at  each  layer  a 
best-explanation  composite  hypothesis  is  formed  of  the  data  presented  by  the  layer  or 
layers  below,  with  the  help  of  information  from  above.  The  formation  of  such  a 
hypothesis  is  a  process  of  abductive  inference.  The  model  treats  perception  as  a  kind  of 
frozen  or  “compiled”  deliberation. 
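
A toy sketch may help fix the idea of layers that each explain the
output of the layer below with help from above. It is a deliberately
reduced illustration, not the authors' model: it makes a single
bottom-up sweep, represents top-down information as a fixed set of
preferred interpretations per layer, and uses invented stroke and letter
data. All names are hypothetical.

```python
def layered_abduction(signal, layers, expectations=None):
    """Pass data upward through the layers: each layer's interpretation
    becomes the data for the layer above. `expectations` maps a layer
    name to preferences that stand in for information from above."""
    expectations = expectations or {}
    data = signal
    interpretations = []
    for name, explain in layers:
        data = explain(data, expectations.get(name))
        interpretations.append((name, data))
    return interpretations


# Invented low-level ambiguity: each stroke has rival explanations.
CANDIDATES = {"|": ["l", "1"], "O": ["o", "0"]}


def letter_layer(strokes, prefer=None):
    """Explain each stroke by the rival favored from above, else the first."""
    out = []
    for s in strokes:
        options = CANDIDATES.get(s, [s])
        favored = [o for o in options if prefer and o in prefer]
        out.append(favored[0] if favored else options[0])
    return out


def word_layer(letters, prefer=None):
    """Explain a sequence of letters as a single string."""
    return "".join(letters)
```

The same strokes yield different letter-level interpretations depending
on what the higher level expects, which is the sense in which bottom-up
and top-down processing meet in one framework.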

Perception  as  compiled  deliberation 

There  is  a  long  tradition  of  belief  in  Philosophy  and  Psychology  that  perception 
relies  on  some  form  of  inference  (Kant,  1787;  Bruner,  1957;  Gregory,  1987;  Fodor, 
1983). But this has been typically thought of as some form of deduction, or simple
recognition, or feature-based classification, not as abduction. In recent times researchers
have occasionally proposed that perception, or at least language understanding, involves
some form of abduction or explanation-based inference (Charniak and McDermott, 1985,
p. 557; Charniak, 1986; Dasigi, 1988; Josephson, 1982, pp. 87-94; Fodor, 1983, pp. 88,
104; Hobbs, Stickel, Martin, et al. 1988). As Peirce, the philosopher who originally
coined the term “abduction”, wrote, “Abductive inference shades into perceptual
judgment without any sharp line of demarcation between them” (Peirce, 1902, p. 304).

Thinking of perception as inferential is not to suppose that perception uses
deliberative reasoning. The abductive inferences we hypothesize to occur in perception
are  presumably  very  efficient,  with  very  little  explicit  search  going  on  at  run  time 
(perception  time).  Moreover,  some  of  the  parts  of  abductive  processing  might  not  be 
fully  realized  by  actual  processing  at  run  time,  or  they  may  be  done  very  efficiently  with 
no  extraneous  support  processing.  For  example  alternative  hypotheses  may  be  prestored, 
and  simply  activated  during  perception,  rather  than  generated  fresh.  Similarly, 


^4 For example, Darden (1991) describes the modularity of genetic theory.
hypothesis interactions, such as degrees of compatibility and incompatibility, may be
prestored. This kind of storing of knowledge in just the right form for efficient task-
specific  processing  is  just  what  we  mean  by  “compiled”  cognition.  It  is  a  kind  of 
thinking  well  without  thinking  much. 

Cognition  might  be  compiled  either  as  “software”  or  as  “hardware”  (or  in  some 
intermediate  “firmware”  form).  Moreover,  compilation  might  be  done  by  evolution,  or 
through  some  sort  of  learning,  or  incrementally  as  needed  for  perception-time  processing. 
Hardware-compiled  structures  are  presumably  not  plastic,  while  learned  structures  can  be 
subsequently revised (in principle anyway). Thus particular abductive mechanisms
might be formed either by evolution, by learning, or at run time. The point of
compilation for perception is to avoid computationally expensive run-time search. This
can be done by compiling hypothesis fragments and evidential links, as we said. These
evidential links may be implemented by currents running along wires, firing rates of
connections between neurons, weighted symbol associations, or in some other way; but
however  they  are  implemented,  they  will  still  really  be  evidential  links  (that  is,  this  level 
of  description  is  not  dispensable). 

Thus, one way in which perception may plausibly be hypothesized to be very
much like deliberation is that the steps and dependencies should make sense logically
(abductively). Each piece of processing should be justifiable in ways like “. . . is
apparently the only plausible interpretation for this datum,” “. . . combine to make this
hypothesis better than that one,” “. . . was ruled out because . . .,” and so on.

Another bridge from perceptual abduction to deliberative reasoning is that there
are functionally similar kinds of impasse that can occur during processing. Each such
impasse creates a need to fall back on some other form of processing, and provides an
opportunity to learn. One such impasse type is where there are no good hypotheses to
account for some firmly established data item. In this case (if there is time) a
deliberative strategy might be to derive a new hypothesis with the needed properties from
background causal knowledge. Apparently this goes on in diagnosis in domains where
there are good causal models of the system being diagnosed. Another way to handle this
type of impasse is to first capture a description of the data to be accounted for, assume it
can be accounted for by some new hypothesis-forming concept C, and begin to build up
more description of C based upon what can be inferred or reasonably conjectured from
the context. This additional analysis presumably occurs, not so much immediately at run
time in the heat of the original problem solving, but shortly thereafter at “learn time,”
when run-time information is still available, but urgency has subsided and resources are
available for more expensive processing. This learning strategy is neutral between
deliberative and perceptual abduction.

A nice example of compiled abduction in perception is that of 3-dimensional
vision. The data to be explained are the disparities between the images presented by the
two eyes; the explanatory hypotheses are something like overlapping planes, i.e., front-
back relationships across edge boundaries.

Perception  as  Abduction  in  Layers 

Computational  models  of  information  processing  for  things  like  vision  and 
spoken language understanding have commonly supposed an orderly progression of
layers, beginning near the retina or auditory periphery, where hypotheses are formed
about “low-level” features, e.g., edges (in vision) or bursts (in speech perception), and
proceeding by stages to higher-level hypotheses. Models intended to be comprehensive
often suppose three or more major layers, often with sublayers, and sometimes with
parallel channels that separate and combine to support higher-level hypotheses. For
example, shading discontinuities and color contrasts may separately support hypotheses
about object boundary (Marr, 1982; Lesser, et al., 1975). Recent work on primate vision
appears to show the existence of separate channels for information about shading,
texture, and color, not all supplying information to the same layers of interpretation
(Livingstone and Hubel, 1988). Those easily and naturally are separate channels which
then need to converge so as to form a unified hypothesis concerning what is seen.

Another need for layered processing in perception comes from the problem of
combining information from the different senses. Combining information from different
senses is functionally no different from combining information from different channels
within  a  single  sense.  The  different  senses  are  simply  different  channels  to  central, 
higher  “senses.”  Separate  channels  within  the  visual  system  deliver  up  the  data  useful  at 
a  certain  level  to  form  hypotheses  about  the  locations  of  3-d  objects;  similarly,  both  sight 
and  hearing  can  deliver  up  the  data  useful  for  forming  hypotheses  about  object  identity. 
Think  of  distinctive  bird  calls,  for  example,  or  a  lecturer  whose  identity  is  uncertain. 

One  special  problem  for  multi-sense  integration  is  the  problem  of  identifying  a 
“that” delivered up by one sense, with a “that” delivered up by another. Which person is
the one that is speaking? Is the same object being seen in the infrared as that being seen
in the ultraviolet? Logically, it should be possible for information derived from one
sense  to  help  with  resolving  distinct  objects  within  the  other  sense.  There  is  actually 
some  evidence  that  vision  can  help  hearing  to  separate  distinct  streams  of  tones  (Massaro, 
1987,  p.  83)  and  hear  the  tone  stream  as  two  distinct  auditory  objects. 

One  useful  representation  for  multi-sense  object  perception  is  that  of  a  “hot  map” 
consisting  of  overlaid  spatial  representations,  i.e.,  spatial  representations  from  the 
different senses brought into “registry” or “correlated” into a single spatial
representation. Thus, for example, a robot might bring together separate channels of
information from its visual and tactile senses to form a unified spatial representation of
its  immediate  surroundings.  Such  a  map  should  be  maintained  continually,  and  updated 
and  revised  as  new  information  arrives  and  is  interpreted.  This  hot  map,  with  its  symbols 
on  it,  can  be  seen  as  the  resulting  composite  hypothesis  formed  by  a  process  of  layered 
abductive  interpretation. 

In  vision  most  of  the  processing  of  information  is  presumably  bottom-up,  from 
information produced by the sensory organ, through intermediate representations, to the
abstract cognitive categories that are used for reasoning. Yet top-down processing is
presumably also significant, as higher-level information imposes biases, and helps with
identification and disambiguation. Vision can thus be thought of as a layered
interpretation task wherein the output from one layer becomes data to be interpreted at
the next. In image understanding the layers are something like this:

Retinal Activation -> segmentation, edges, regions -> 2-1/2D overlapping planes,
occlusions -> grouping -> object ID, scene analysis

Layers may be fixed in advance or formed at run time. Output from one layer
becomes data to be interpreted at the next layer. Layered interpretation models for
non-perceptual interpretive processes make sense too. For example, medical diagnosis can be
thought of as an inference that typically proceeds from symptoms, to pathological states,
to diseases, to disease processes, to etiologies. In speech comprehension, for example, the
layers might be acoustic signals to phonetics to grammar and clauses etc. to semantics.
Perception, speech comprehension and medical diagnosis all have a similar pattern of
layers of abductive processing. Similar to perception, medical diagnosis is presumably
mostly  bottom  up,  but  with  a  significant  amount  of  top-down  processing  serving  similar 
functions. 

It  is  reasonable  to  expect  that  perceptual  processes  have  been  optimized  over 
evolutionary time (become efficient, not necessarily optimal), and that the specific layers
and their hypotheses, especially at lower levels, have been compiled into special-purpose
mechanisms. Within the life span of a single organism, perceptual learning provides
additional opportunities for compilation and optimization. Nevertheless, it seems that at
each layer of interpretation the abstract information-processing task is the same: that of
forming  a  coherent,  composite  best  explanation  of  the  data  from  the  previous  layer  or 
layers.  That  is,  the  task  is  abduction,  and  in  particular,  abduction  requiring  the  formation 
of  composite  hypotheses. 

If  the  information  processing  that  occurs  in  the  various  layers  and  senses  is 
functionally  similar,  then  perhaps  their  mechanisms  are  similar  too  at  a  certain  level  of 
description. Thus we are led to hypothesize that the information-processing mechanisms
that occur in vision, hearing, understanding spoken language, and in interpreting
information from other senses (natural and robotic), are all variations, incomplete
realizations, or compilations (domain-specific optimizations) of one basic computational
mechanism. Thus we propose what we may call the layered-abduction model of
perception. What is new in this model is the specific hypothesis that perception uses
abductive  inferences,  occurring  in  layers,  together  with  a  specific  computational  model 
of  abductive  processing. 

Task  Structure  Model  of  Abduction  in  Layers 

A  task  structure  (Chandrasekaran,  1990;  Chandrasekaran  and  Johnson,  1993)  is  a 
representation  of  a  task  in  terms  of  the  methods  that  are  applicable  for  it  in  the  domain 
and the conditions under which each method is applicable. A task analysis is a functional
decomposition of an information processing task. Such an analysis is intended to answer
questions about how it is possible to accomplish the task, especially, how the task can be
feasibly accomplished. Thus, the task is divided into subtasks and those subtasks are
divided  into  sub-subtasks,  and  so  on.  The  subtasks  are  those  things  which  are  required  to 
accomplish  the  task,  and  the  sub-subtasks  are  those  things  which  are  required  to 
accomplish  the  subtasks. 

A task structure analysis identifies domain-specific and domain-independent
aspects of the tasks and methods for their accomplishment. The task can be feasibly
accomplished  if  a  decomposition  can  be  found  such  that  each  subtask  is  feasible  and  can 
be  combined  with  the  other  subtasks  by  a  feasible  method. 

For each task (subtask, sub-subtask, etc.) there are one or more possible methods
that can be used to accomplish it. Methods are ways to accomplish some task; for example,
using an abacus, using your fingers, and using a calculator are all alternative methods for
performing the task of addition. Each method is itself specified in terms of how it uses
knowledge and inference to achieve its goals, and in terms of what subgoals (subtasks) it
sets up and requires to be achieved before it can succeed. This kind of decomposition
can be done recursively until methods which achieve subgoals but which do not set up
additional subgoals of their own are reached. Alternative methods for accomplishing the
task  may  make  use  of  a  common  subtask,  and  so  on  recursively. 

A task can have one or more methods associated with accomplishing it. Each of
the methods is characterized by forms of knowledge and inference that are necessary for
carrying out the method and by additional subtasks that will need to be achieved in order
to complete the application of the method for that task. A method can be a procedure
where the sequencing of steps is all prespecified, but it can be more abstract; in Newell’s
problem space terminology [Newell, 19^), it can be a search in a problem space. (In
fact, such methods are the ones that are interesting from an AI point of view.) For
example, the problem of classification has a method called hierarchical classification,
which consists of exploring the classification hypotheses organized as a hierarchy. This
calls for knowledge in the form of hierarchies and inference methods which are
variations of, and include as a default strategy, top-down explorations of the hierarchy.
This method has subtasks in the form of evaluating the evidence for or against a
hypothesis so that it can be established or rejected. This subtask similarly can have many
methods associated with it, each of which is characterized by its own knowledge and
inference  requirements. 

We can analyze the task of abduction to have two main subtasks:

•  subtask  1)  generating  elementary  hypotheses, 

•  subtask  2)  synthesizing  composite  explanations. 

The  subtask,  generating  elementary  hypotheses,  has  two  sub-subtasks: 

•  sub-subtask  1.1)  evoking 

•  sub-subtask  1.2)  instantiating  elementary  hypotheses 

The sub-subtask 1.2, instantiating elementary hypotheses, has two sub-sub-subtasks:

•  sub-sub-subtask  1.2.1)  scoring  the  elementary  hypotheses 

•  sub-sub-subtask  1.2.2)  determining  explanatory  coverage.

This decomposition is very general. The typical abductive answer is a composite,
and somehow, explicitly or implicitly, there must be some method of choosing which
elementary hypotheses to consider, some way of making them specific to the case, and
some way of accepting and combining them. Given this task structure, the information
processing at each layer is decomposed into three functionally distinct types of activity:
evocation of hypotheses and instantiation of hypotheses, which are the two sub-subtasks
making up the first subtask, and composition of hypotheses, which makes up the second
subtask of the overall abductive task.
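
The decomposition above can be written down as a small tree. The following is only a minimal sketch: the Task class and its leaves helper are illustrative names introduced here, and the methods attached to each task are deliberately left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    """A node in a task structure: a task and the subtasks it is divided into."""
    name: str
    subtasks: List["Task"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        # Leaf tasks set up no further subtasks of their own.
        if not self.subtasks:
            return [self.name]
        return [leaf for sub in self.subtasks for leaf in sub.leaves()]


# The abduction task structure as listed in the text.
abduction = Task("abduction", [
    Task("1 generating elementary hypotheses", [
        Task("1.1 evoking"),
        Task("1.2 instantiating elementary hypotheses", [
            Task("1.2.1 scoring the elementary hypotheses"),
            Task("1.2.2 determining explanatory coverage"),
        ]),
    ]),
    Task("2 synthesizing composite explanations"),
])

leaf_tasks = abduction.leaves()
```

Walking the leaves recovers the three kinds of activity named in the text: evocation (1.1), instantiation (1.2.1 and 1.2.2), and composition (2).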

Evocation (sub-subtask 1.1) can occur bottom-up, a hypothesis being stimulated
for consideration when it is cued by the data presented at a layer below. That is, the
presence of a certain finding suggests that certain hypotheses are appropriate to consider.
More than one hypothesis may be suggested by a given datum. Evocation can also occur
top-down, either as the result of priming (an expectation from a level above), or as a
consequence of data-seeking activity from above, which can arise from the need for
evaluation, as when you want to decide whether the person coming towards you is
really Jane, and so you look for Jane’s characteristic mole on the left cheek. Evocations
can generally be performed in parallel, and need not be synchronized.

Instantiation (sub-subtask 1.2) occurs when each stimulated hypothesis is
independently scored for confidence (sub-sub-subtask 1.2.1), and a determination is
made about what part or aspect of the data is accounted for by the hypothesis
(sub-sub-subtask 1.2.2). Instantiation is in general top-down.* During instantiation data may be
sought that was not part of the original stimulus for evoking a hypothesis. Each
hypothesis is given a confidence value on some scale, which can be taken to be a “local
match” or prima facie likelihood, a likelihood of being true based only on consideration
of the match between the hypothesis and the data, with no consideration of interactions
between potentially rival or otherwise related hypotheses. Typically, many evoked
hypotheses will get very low scores, and can be tentatively eliminated from further
consideration. The data that are accounted for by a hypothesis may or may not be
identical to the data on the basis of which the hypothesis was scored or the data that did
the evoking.

In  the  course  of  instantiation  the  hypothesis  set  may  be  expanded  by  including 
subtypes  and  supertypes  of  high-confidence  hypotheses,  if  the  space  of  potential 
hypotheses  is  organized  hierarchically  by  level  of  specificity.  Instantiation  is  done  most 
efficiently when it is based on matching against prestored patterns of features, but slower
processes  of  instantiation  are  also  possible  whereby  the  features  to  match  are  generated  at 
run  time. 

The  result  of  a  wave  of  instantiation  activity  is  a  set  of  hypotheses,  each  with 
some measure of confidence, and each offering to account for some portion of the data.
Since within a particular wave of instantiation, hypotheses are considered independently
of  each  other,  the  process  of  instantiation  can  go  on  in  parallel. 
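
One wave of bottom-up evocation followed by instantiation might be sketched as follows. This is an illustration under stated assumptions, not the implemented system: the cue table, the feature patterns, the fraction-matched score, and the elimination threshold are all hypothetical choices made for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Hypothesis:
    name: str
    confidence: float          # local-match score, rivals not considered
    explains: FrozenSet[str]   # portion of the data accounted for


# Hypothetical knowledge: which findings cue which hypotheses, and the
# prestored feature pattern that each hypothesis is matched against.
CUES: Dict[str, List[str]] = {
    "fever": ["flu", "measles"],
    "cough": ["flu", "cold"],
    "rash": ["measles"],
}
PATTERNS: Dict[str, Set[str]] = {
    "flu": {"fever", "cough", "aches"},
    "cold": {"cough", "sneezing"},
    "measles": {"fever", "rash"},
}


def evoke(data: Set[str]) -> Set[str]:
    # Bottom-up evocation: each datum may suggest several hypotheses.
    return {h for d in data for h in CUES.get(d, [])}


def instantiate(name: str, data: Set[str]) -> Hypothesis:
    # Score independently of other hypotheses, and record coverage.
    pattern = PATTERNS[name]
    matched = pattern & data
    return Hypothesis(name, len(matched) / len(pattern), frozenset(matched))


def instantiation_wave(data: Set[str], threshold: float = 0.4) -> List[Hypothesis]:
    scored = [instantiate(h, data) for h in sorted(evoke(data))]
    # Very low scores are tentatively eliminated from consideration.
    return [h for h in scored if h.confidence >= threshold]


wave = instantiation_wave({"fever", "cough"})
```

Because each call to instantiate looks only at its own hypothesis and the data, the calls inside a wave could run in parallel, as the text notes.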

Composition (subtask 2) occurs when the instantiated hypotheses interact with
each other and (under good conditions) a coherent best interpretation emerges. At the
beginning many hypotheses will probably have intermediate scores, representing
hypotheses that can neither be taken as practically certain, nor as being of such a low
confidence as to be ignorable. So knowledge of interactions between the hypotheses is
brought  to  bear  to  reduce  the  degree  of  uncertainty,  increasing  confidence  in  some  of 
them,  and  decreasing  confidence  in  others.  (See  Josephson  and  Josephson,  1994,  for 
details  of  strategies.) 
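
As a rough illustration of composition, the sketch below assembles a composite greedily. It is a stand-in strategy assumed for this example, not one of the strategies detailed in the cited work: hypotheses are taken in order of confidence, and one is accepted only if it explains something still unexplained and is compatible with what has already been accepted. The hypothesis names, scores, and the incompatibility table are hypothetical.

```python
from typing import FrozenSet, List, Set, Tuple

# (name, confidence, data explained); hypothetical instantiated hypotheses.
Scored = Tuple[str, float, FrozenSet[str]]


def compose(hypotheses: List[Scored],
            data: Set[str],
            incompatible: Set[FrozenSet[str]]) -> Tuple[List[str], Set[str]]:
    """Greedily assemble a composite; return it with any unexplained data."""
    accepted: List[str] = []
    unexplained = set(data)
    # Consider the most confident hypotheses first.
    for name, _conf, explains in sorted(hypotheses, key=lambda h: -h[1]):
        if not explains & unexplained:
            continue  # adds no explanatory coverage
        if any(frozenset({name, other}) in incompatible for other in accepted):
            continue  # conflicts with an already accepted hypothesis
        accepted.append(name)
        unexplained -= explains
    return accepted, unexplained


candidates: List[Scored] = [
    ("flu", 0.7, frozenset({"fever", "cough"})),
    ("cold", 0.5, frozenset({"cough"})),
    ("measles", 0.5, frozenset({"fever", "rash"})),
]
composite, leftover = compose(
    candidates,
    data={"fever", "cough", "rash"},
    incompatible={frozenset({"flu", "cold"})},
)
```

A greedy pass never revises confidence values, so it captures only the acceptance step; the mutual raising and lowering of confidence described in the text would need an iterative scheme on top of this.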

Top-down  information  flow 

In  the  layered-abduction  computational  model  composite  hypotheses  are  formed 
in several places at once in a coordinated fashion. Each locus of hypothesis formation
we call an agora, after the market place where the ancient Greeks would gather for
dialog and debate. The idea is that an agora is a place where hypotheses of a certain type
gather,  and  contend,  and  where  under  good  conditions  a  consensus  hypothesis  emerges. 

In typical cases the emerging hypothesis will be a composite, coherent in itself, and with
different sub-hypotheses accounting for different portions of the data. For example in
vision the “edge agora” is the presumed location where edge hypotheses are formed and
accepted, each specific edge hypothesis accounting for certain specific data from
lower-level agoras.

* Under certain circumstances it is good enough just to score on the basis of voting
from below, and then no top-down processing need occur, at least for scoring. Though this
strategy is less than logically ideal, it is computationally less expensive, at least in the
short run. This represents a kind of compilation where accuracy is traded for quickness.

The  agoras  are  organized  in  an  information-flow  network  with  a  clear  sense  of 
direction  defining  the  difference  between  bottom-up  and  top-down  flow.  Bottom-up  is 
from  data  to  interpretation.  The  main  output  from  an  agora  is  “upwards,”  the  data  to  be 
interpreted  by  “higher”  agoras.  Another  possible  output  is  “downwards,”  for  example 
expectation might influence the consideration and evaluation of hypotheses at lower
agoras. Thus the relationship between “data” and “explanation” is a relative one, and an
explanation accepted at one agora becomes data to be explained at the next. Though we
sometimes  use  the  word  “layer”  to  describe  an  agora  or  contiguous  cluster  of  them,  we 
do  not  suppose  that  the  agoras  are  all  neatly  lined  up.  The  paths  may  be  branching  and 
joining  (but  no  cycles  are  permitted).  The  basic  strategy  is  to  try  to  solve  the  overall 
abduction  problem  at  a  given  agora  by  solving  a  sufficient  number  of  smaller  and  easier 
abductive  sub-problems  at  that  agora. 
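
The branching and joining network of agoras, with no cycles permitted, is a directed acyclic graph. A minimal sketch using Python's standard graphlib module shows one valid bottom-up ordering and a cycle check. The agora names and the FEEDS_FROM table are hypothetical, and a fixed ordering is only one possible schedule, since the model itself describes concurrent, distributed processing.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Each agora maps to the lower agoras whose accepted hypotheses it
# receives as data. The agora names are hypothetical.
FEEDS_FROM: Dict[str, Set[str]] = {
    "edges": {"shading", "color"},
    "surfaces": {"edges", "texture"},
    "objects": {"surfaces"},
}


def bottom_up_order(feeds_from: Dict[str, Set[str]]) -> List[str]:
    # static_order() lists every agora after all of its data sources and
    # raises CycleError if the network contains a cycle.
    return list(TopologicalSorter(feeds_from).static_order())


def has_cycle(feeds_from: Dict[str, Set[str]]) -> bool:
    try:
        bottom_up_order(feeds_from)
        return False
    except CycleError:
        return True


order = bottom_up_order(FEEDS_FROM)
```

Top-down influence (priming, doubting) runs against this ordering, so it would be modeled as messages sent back along the same edges rather than as extra edges in the graph.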

Downward-flowing  information  processing  between  layers  can  occur  in  at  least 
four  ways.  One  is  that  the  data-seeking  needs  of  hypothesis  evaluation  or  discrimination 
can  provoke  instantiation  (top-down  evocation  and  evaluation  of  a  hypothesis).  Another 
is that expectations based on firmly established hypotheses at one layer can “prime”
certain data items (i.e., evoke consideration of them and bias their score upwards). A
third way is that hypotheses that are uninterpretable as data at the higher level (no
explanation can be found) can be “doubted,” and reconsideration of them provoked (also
reconsideration of any higher-level hypotheses whose confidence depended on the
questionable  datum).  Finally,  data  pairs  may  be  jointly  uninterpretable,  as  for  example  in 
vision  when  the  image  has  two  mutually  incompatible  interpretations  (to  some  degree  of 
strength),  and  reconsideration  can  be  provoked  from  above.  In  these  ways  higher-level 
interpretations  can  exert  a  strong  influence  on  the  formation  of  hypotheses  at  lower 
levels,  and  layer-layer  harmony  is  a  two-sided  negotiation. 

We  may  summarize  the  control  strategy  for  hypothesis-composition  by  saying 
that  it  employs  multi-level  and  multiple  intra-level  island-driven  processing.  Islands  of 
relative  certainty  are  seeded  by  local  abductions  and  propagate  laterally 
(incompatibilities, positive associations), downwards (expectations), and upwards
(accepted hypotheses become data to be accounted for). Processing occurs concurrently
and in a distributed fashion. Higher levels provide soft constraints through the impact of
expectations on hypothesis evocation and scoring, but this does not strictly limit the
hypotheses  that  may  be  accepted  at  lower  levels. 

Conclusion

Notice the level of description we have given here. This architecture is a high-level
architecture and is potentially compatible with a number of different
implementation architectures, such as neural nets or traditional symbolic programming. It
might be implemented as an “algorithmic computer,” that is, an instruction follower, or as
a “connectionist computer,” whose primitive processing elements work by propagating
activations. We have described the functional and semantic significance of various
actions of the machine, and the flow of control, but not precisely how these actions are
implemented on some underlying machine. That is, since we view the problem in a task
structure way, we have a content-driven perspective. The layered abduction view does
not commit us to using connectionism, or symbolic processing. It is a theory of the
information processing task, that is, a computational theory which is a description of the
input-output relations, and by saying it occurs in layers and how it occurs at each level in
terms of subtasks, etc., we are describing the strategy for the task. This task structure
level  is  above  the  level  of  algorithms  and  data  structures,  or  nodes  and  weights,  which  in 
turn  are  a  level  above  the  level  of  implementation.  In  this  sense  the  task  structure 
analysis  is  a  content  theory. 

A task structure has an enormous amount of leverage in directing knowledge
acquisition and system building, since the knowledge and inference requirements for the
methods can be explicitly identified. The task-oriented view has significant potential to
aid learning. In addition to the knowledge-type vocabulary that it provides for each task,
its ability to generate explanations at the right level can give significant leverage for
learning (Chandrasekaran, 1986, 1987; Bylander and Chandrasekaran, 1987).


Abstract. In this paper we present some of the work and ideas developed
at the Ohio State Laboratory for AI Research on explaining the
behavior of knowledge systems. The first part of the paper presents an
analysis of the explanation problem and the aspects of it that we have
concentrated on (briefly, we are concerned more with the form and content
of the representations than the explanation form or presentation).
Then we describe a generic task-based approach to explanation, including
relating the explanation to the logical structure of the task. Finally,
we show how causal models of a domain can be used to give explanations
of diagnostic decisions.


1  Aspects  of  Explanation 

As described by Chandrasekaran, Tanner, and Josephson [8], we can separate
the  explanation  generation  problem  in  knowledge  systems  into  three  top-level 
functions:  generating  the  content,  being  responsive,  and  interacting  with  human 
users. 

Generating an explanation’s basic content. Given user queries about a system's
decisions, we need to generate an information structure containing the
elements  needed  for  an  explanation. 

Shaping  explanations  to  match  user  knowledge.  It  may  not  be  necessary 
to  communicate  all  the  available  explanation  content  to  users.  Systems  apply 
knowledge  of  user  goals,  state  of  knowledge,  and  the  dialog  structure  to 
filter, shape, and organize the output of the above content process so that
explanations  respond  to  user  needs. 

Interacting with users. The two preceding functions produce all the information
needed conceptually and logically for the required explanation. However,
presentation  issues  remain;  specifically,  how  an  appropriate  human-computer 
interface  effectively  displays  and  presents  information  to  users. 

* Parts of this paper appeared in B. Chandrasekaran, M. C. Tanner, and J. R. Josephson,
“Explaining control strategies in problem solving,” IEEE Expert, 4(1), pp. 9-34,
1989, and M. C. Tanner and A. M. Keuneke, “The role of the task structure and domain
functional models,” IEEE Expert, 6(3), 1991, pp. 50-57. Reused by permission
of IEEE Computer Society.

If explanation content is inadequate or inappropriate - no matter how good
theories  for  responsiveness  and  interface  functions  are  -  then  correspondingly 
poor  explanations  will  be  presented.  Thus,  generating  the  correct  explanation 
content  is  the  central  problem  in  explanation  generation.  We  can  break  this  down 
into  the  following  types: 

Step explanation. Relating portions of the data in a particular case to the
knowledge  for  making  specific  decisions  or  choices,  i.e.,  explaining  the  steps 
in  the  solution  process. 

Strategic explanation. Relating decisions to follow particular lines of reasoning
to the problem solver’s goals.

Task  explanation.  Relating  the  system’s  actions  and  conclusions  to  the  goals 
of  the  task  it  performs. 

Knowledge justification. Relating conclusions and problem-solving knowledge
to other domain knowledge, possibly showing how they were obtained.

These four kinds of explanation are related to the act, or process, of solving
problems. A knowledge system might be asked questions about many other relevant
things, including requests for definition of terms and exam-like questions
that  test  the  system’s  knowledge.  Answers  to  these  questions  may  or  may  not 
be  explanations,  as  such,  but  a  knowledge  system  should  still  be  able  to  produce 
them.  All  of  these  kinds  of  explanation  correspond  to  structures  that  must  be 
examined  when  constructing  explanations,  even  though  some  of  the  structures 
may  not  be  needed  to  solve  problems  in  the  system’s  domain. 

Often explanations are produced by introspection, i.e., a program examines
its own knowledge and problem-solving memory to explain itself. Step and strategic
explanations are most often done this way. But sometimes explanations are
concocted, i.e., they do not relate to how the decision was actually made, but
independently make decisions plausible. Constructing such post facto justifications
or explanations is necessary when problem solvers have no access to their
own problem solving records, or when the information contained in those records
is incomprehensible to users. The explanation may argue convincingly that the
answer is correct without actually referring to the derivation process, just as
mathematical proofs persuade without representing the process by which mathematicians
derive theorems. Task explanations and knowledge justifications are
often done this way. Generating explanations of this sort is an interesting problem
solving process in its own right [23, 39].

2 Tasks, Methods and Explanations

In this section we give a brief outline of the notion of a task analysis as developed
by Chandrasekaran [3]. Explaining a knowledge system’s solutions requires,
among other things, showing on one hand how the logical requirements
of the task were satisfied by the solution and, on the other hand, showing how
the method adopted (the strategy) achieved the task in the problem-solving instance.
In principle there may be more than one method for a task. Most knowledge
systems have “hard-wired” specific methods for the tasks. Thus Mycin can
be understood as solving the diagnostic task by the method of heuristic classification,
which in turn performs the subtasks of data analysis, heuristic match,
and refinement [9]. In the generic task (GT) framework, developed at Ohio State,
we have identified a number of task-method combinations that can be used as
building blocks for more complex tasks. Thus, for example, the task of diagnosis
is associated with a generic method called abductive assembly, which in turn
sets up a subtask of hypothesis generation. In our GT work, a generic method
called hierarchical classification is proposed for exploring certain types of hypothesis
spaces. This in turn sets up subtasks for evaluating hypotheses in the
hierarchy for which a generic method called hierarchical evidence abstraction is
proposed. What we have called GTs are in fact task-method combinations. The
method is particularly appropriate to the task because it is commonly used in
many domains for that task and it gives significant computational advantages.
Two of the GTs we have identified are Hierarchical Classification and Design by
Plan Selection and Refinement.*

Hierarchical  Classification 

Task:  If  a  hypothesis  hierarchy  is  available,  generate  hypotheses  that  match 
the  data  describing  a  situation. 

Method:  For  each  hypothesis,  set  up  a  subtask  to  establish  or  reject  it.  If 
it  is  established,  test  its  successors.  If  it  is  rejected,  it  and  its  successors 
are  rejected.  The  top-down  control  strategy,  called  Establish-Refine,  can 
be  varied  under  specific  conditions.  Bylander  and  Mittal  [2]  elaborate  on 
this  simplified  account. 

Design  by  Plan  Selection  and  Refinement 

Task:  Design  an  object  that  satisfies  certain  specifications. 

Method: Design is separated into a hierarchy of subdesign problems, mirroring
the object’s component structure. For each node in the hierarchy,
there are plans for making commitments for some component parameters.
Each component is designed by choosing a plan, based on some
specifications,  which  instantiates  some  design  parts  and  designs  further 
subcomponents  to  fill  in  other  parts.  We  describe  this  task  in  more  detail 
in  Sect.  3,  but  Brown  and  Chandrasekaran  [1]  is  the  definitive  reference 
on  this  topic. 
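
The Establish-Refine strategy described under Hierarchical Classification above can be sketched in a few lines. This is a simplified illustration, not the Generic Task Toolset implementation: the hierarchy, the establish callback, and the set of supported hypotheses are hypothetical, and the establish-or-reject subtask is reduced to a set lookup.

```python
from typing import Callable, Dict, List

# Hypothetical hypothesis hierarchy: each node lists its successors.
HIERARCHY: Dict[str, List[str]] = {
    "disease": ["infection", "injury"],
    "infection": ["viral", "bacterial"],
    "injury": [],
    "viral": [],
    "bacterial": [],
}


def establish_refine(node: str,
                     establish: Callable[[str], bool],
                     hierarchy: Dict[str, List[str]]) -> List[str]:
    """Return the established hypotheses at or below `node`, top-down."""
    if not establish(node):
        # Rejecting a hypothesis rejects its successors as well.
        return []
    established = [node]
    for successor in hierarchy.get(node, []):
        established += establish_refine(successor, establish, hierarchy)
    return established


# Stand-in for the establish-or-reject subtask; a real system would
# evaluate the evidence for each hypothesis against the case data.
supported = {"disease", "injury", "viral"}
tested: List[str] = []


def establish(hypothesis: str) -> bool:
    tested.append(hypothesis)
    return hypothesis in supported


result = establish_refine("disease", establish, HIERARCHY)
```

In this run "infection" is rejected, so "viral" and "bacterial" are never tested even though "viral" would have matched. That pruning is the computational advantage the method offers, at the cost of depending on a well-organized hierarchy.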

Each GT method is explicitly supported by a high-level language that aids knowledge
system development by giving the knowledge engineer access to tools that
work closer to the problem level, not the rule or frame level. However, it may
be necessary to divide non-trivial problems into subproblems such that each
matches some GT. This way of building complex knowledge systems also means
that knowledge engineering environments should provide a tool set rather than
a single tool. Consequently, some of our recent work has concentrated on developing
the Generic Task Toolset [21].

* The following description differs from descriptions in earlier papers [4], since we have
separated the task and the method explicitly.

In using the GT theory for explanation we need to show how the method-specific
high-level language helps in explicitly and directly generating explanations
of strategy at the right level. This is what our early work involved, and is
described in Sect. 3. In general, however, we also need to relate the logical structure
of the task to the strategy employed. Issues involved in this are discussed in
Sect.  4.  Then  in  Sects.  5  and  6  we  describe  work  on  justifying  problem-solving 
knowledge  by  reference  to  the  more  general  knowledge  on  which  it  is  based. 

3  Generic  Tasks  and  Explanation  -  An  Example 

GTs  make  strategic  explanation  possible  for  systems  built  using  this  theory  [8]. 
Additionally, any explanation facility should be able to explain the steps a problem
solver takes in reaching a solution. In this section we describe MPA (a Mission
Planning Assistant) [18], a GT program capable of explaining its problem-solving
steps and its strategy. We transferred the techniques developed on this
system  to  the  Generic  Task  Toolset  [21]  so  that  any  system  built  using  those 
tools  could  explain  its  steps  and  strategy. 

MPA is a GT implementation of Knobs [13], a system that plans a particular
kind of Air Force mission.* The type of planning required can be viewed as design,
i.e., designing the plan. So we used the GT of Design by Plan Selection and
Refinement,  and  implemented  MPA  using  the  generic  task  tool  DSPL  (Design 
Specialists  and  Plans  Language)  [1]. 


3.1 Overview of DSPL

A design problem solver in DSPL is a hierarchy of specialists, each responsible for
a specific design portion (see Fig. 1). Specialists higher up in the hierarchy deal
with the more-general aspects of devices being designed, while specialists lower
in the hierarchy design more-specific subportions. The organization of the
specialists, and the specific content of each, is intended to capture design
expertise in the problem domain.

Each specialist contains design knowledge necessary to accomplish a portion
of the design (see Fig. 2). Each specialist has several types of knowledge but,
for simplicity, we will describe only three. First, explicit design plans in each
specialist encode sequences of possible actions to successfully complete that
specialist's task. The plans consist of design steps, each of which chooses a value
for one parameter of the design. Second, specialists have design plan sponsors
associated with each plan to determine that plan's appropriateness in the runtime
context. And third, each specialist has a design plan selector to examine
runtime judgments of sponsors and to determine the plan most appropriate to
the current problem context.

In a DSPL system, control proceeds from the topmost design specialist to the
lowest. Each specialist selects a plan appropriate to the problem's requirements.
The system executes plans by performing design actions that the plan specifies
(which may include computing and assigning specific values to device attributes,
checking constraints, or invoking subspecialists to complete another portion of
the design).

* MPA actually implements a very simplified version of the problem.

3.2  Description  of  MPA 

In  MPA  the  top-level  specialist  is  OCA  (for  Offensive  Counter  Air,  the  type 
of  mission  MPA  works  on),  which  has  design  plans  for  producing  the  mission 
plan.  Subcomponents  of  the  mission  are  the  Base  and  the  Aircraft  (see  Fig.  1). 
The  Base  specialist  chooses  the  air  base  (actually  a  list  of  bases)  using  the 
requirements  of  the  mission.  The  Aircraft  specialist  chooses  an  aircraft  type, 
and  then  configures  the  aircraft  for  the  mission  using  subspecialists. 

As an example of a specialist, consider Aircraft (shown in Fig. 2). It contains a
selector, which in this case is the default selector built into DSPL. The default
selector simply chooses the best plan, according to ratings assigned by the
sponsors, and if there are no good plans it fails.* Aircraft also contains three
sponsors, one for each of its plans. It has a plan for each aircraft type (A-10,
F-4, and F-111).
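
The sponsor-selector mechanism can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not DSPL itself: the rating vocabulary (RATINGS), the range_km context key, and the three sponsor rules are invented here, since the text does not give the sponsors' actual knowledge.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical rating vocabulary, ordered from worst to best.
RATINGS = {"rule-out": 0, "unsuitable": 1, "suitable": 2, "perfect": 3}


@dataclass
class Plan:
    name: str
    # A sponsor inspects the runtime context and returns a key of RATINGS.
    sponsor: Callable[[Dict[str, float]], str]


def default_select(plans: List[Plan], context: Dict[str, float]) -> Optional[Plan]:
    """Return the best-rated plan, or None (failure) if no plan is rated good."""
    rated = [(RATINGS[plan.sponsor(context)], plan) for plan in plans]
    good = [pair for pair in rated if pair[0] >= RATINGS["suitable"]]
    if not good:
        return None  # the specialist fails; failure handling is not modeled here
    # max() keeps the first plan among equally rated ones.
    return max(good, key=lambda pair: pair[0])[1]


# Invented sponsors: each rates its plan from a made-up range requirement.
AIRCRAFT_PLANS = [
    Plan("A-10", lambda ctx: "perfect" if ctx["range_km"] <= 400 else "rule-out"),
    Plan("F-4", lambda ctx: "suitable" if ctx["range_km"] <= 800 else "rule-out"),
    Plan("F-111", lambda ctx: "suitable" if ctx["range_km"] <= 2000 else "rule-out"),
]

chosen = default_select(AIRCRAFT_PLANS, {"range_km": 300})
print(chosen.name if chosen else "no suitable plan")
```

Keeping the sponsors separate from the selector mirrors the division of labor described above: sponsors judge one plan each, and the selector only compares their judgments.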

The DSPL code for Aircraft's A-10 Plan is given in Fig. 3. MPA decides
whether A-10 is the appropriate aircraft type for the mission using its
sponsor-selector mechanism described above. If A-10 is appropriate, this plan is
executed. The BODY contains a list of the steps in the plan. It first notes the
aircraft type, then chooses a squadron. The base is determined from the chosen
squadron and then the range to the target is computed. Finally, the aircraft is
configured for the mission (bomb and fuel load) by calling the subspecialist
Configure-A-10.


(PLAN
  (NAME A-10)
  (SPONSOR A-10-Sponsor)
  (PURPOSE "considering the feasibility of an A-10 for the mission")
  (ACHIEVED "chose an A-10 for the mission")
  (BODY
    AssignAircraftType
    ChooseSquadron
    AssignBase
    GetRange
    (DESIGN Configure-A-10)))


Fig. 3. MPA's A-10 Plan


One of the steps, ChooseSquadron, is shown in Fig. 4. Steps in DSPL set the
value of a single design attribute, so the step first identifies the attribute it
sets (SQUAD). The DSPL functions KB-FETCH and KB-STORE fetch and store attribute
values of the design. The KNOWN section of the body is a facility for defining local
variables for known attributes. REPLY contains the main work of the step. In
this case the REPLY simply stores an attribute value but, in general, it could do
more work to decide on the value. The Lisp function squad-select implements
the expert's method of choosing a squadron given the aircraft type and bases
available. Since this is done in a Lisp function, it will not be easily explainable.
A better implementation of MPA would include this decision process in DSPL,
rather than in Lisp.

* DSPL contains a mechanism, not described here, for dealing with failures. Brown
and Chandrasekaran [1] provide the details.


(STEP
  (NAME ChooseSquadron)
  (ATTRIBUTE-NAME SQUAD)
  (PURPOSE "selecting a squadron for the mission")
  (ACHIEVED "selection of A as the squadron for the mission")
  (BODY
    (KNOWN
      plane (KB-FETCH AIRCRAFT)
      base-list (KB-FETCH BASELIST))
    (REPLY
      (KB-STORE SQUAD (squad-select plane base-list)))))


Fig. 4. MPA's ChooseSquadron Step


In  the  end,  MPA  produces  a  list  of  attributes  for  the  mission  with  the  values 
it  decided  upon.  For  example: 

Aircraft Type          A-10
Number of Aircraft     4
Squadron               118TFW
Airbase                Sembach


This list is actually a menu, from which users can select any value and ask MPA
to explain how that value was decided.

3.3 Explanation in MPA

We implemented explanation in MPA on the organizing principle that the agent
that makes a decision is responsible for explaining it. For our purposes, we
consider the things we have described - specialists, selectors, sponsors, plans, and
steps (Figs. 2-4) - to be problem-solving agents in DSPL. The current
implementation of MPA contains nearly 100 of these agents, which call upon each other
during the problem-solving process. All of these agents have well-defined roles,
so the system can explain an agent's decisions in terms of the goals of its calling
agent, the agent's own role in the pursuit of those goals, and the roles of other
agents it called upon. To do this we added slots called PURPOSE and ACHIEVED
to the agent definitions in DSPL to hold text strings for describing the agents'
goals. Then to explain how MPA decided on a particular attribute value, the
explanation module puts these strings together in an order that depends on the
runtime context in which the decision was made. Given such an explanation
users can select any of the other agents and ask for further elaboration from
them.

Suppose a user selected the value "118TFW" of the attribute "Unit". The
only question users can ask MPA, and the only explanation it can give, is a form
of "How was it decided?" Thus, the user's selection in this case implicitly asks,
"How was it decided that the Unit should be 118TFW?" The explanation is
given in Fig. 5. This decision was made by the ChooseSquadron step so the
explanation comes from that agent. The explanation first gives the purpose of
the calling agent (shown in italics), which comes from the A-10 plan in this case.
Then it gives the values of the local variables. Finally, it gives the value it chose
for its attribute.


The context of considering the feasibility of an A-10 for the mission determined that:

- plane was A-10

- base-list was Ramstein, Bitburg, Sembach

So 118TFW was an appropriate choice for Unit.

Fig. 5. Explanation for a Step
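
To make the mechanics of a step explanation concrete, here is a minimal Python sketch of a step that fetches its known attributes, stores one value, and then reports the caller's purpose, its local variables, and its choice. The Step class, the dictionary standing in for the design knowledge base, and the body of squad_select are assumptions made for illustration; the real squad-select is an opaque Lisp function whose logic the text does not give.

```python
from typing import Callable, Dict, List


class Step:
    """A step sets one design attribute and can explain how it did so."""

    def __init__(self, attribute: str, known: Dict[str, str], decide: Callable[..., str]):
        self.attribute = attribute  # attribute this step sets
        self.known = known          # local variable name -> attribute to fetch
        self.decide = decide        # stand-in for the REPLY computation
        self.locals: Dict[str, object] = {}
        self.value = None
        self.caller_purpose = ""

    def run(self, kb: Dict[str, object], caller_purpose: str) -> None:
        self.caller_purpose = caller_purpose
        # KNOWN: bind local variables to attribute values already in the design.
        self.locals = {var: kb[attr] for var, attr in self.known.items()}
        # REPLY: decide on a value and store it (the analogue of KB-STORE).
        self.value = self.decide(*self.locals.values())
        kb[self.attribute] = self.value

    def explain(self) -> str:
        lines = [f"The context of {self.caller_purpose} determined that:"]
        for var, val in self.locals.items():
            shown = ", ".join(val) if isinstance(val, list) else str(val)
            lines.append(f"- {var} was {shown}")
        lines.append(f"So {self.value} was an appropriate choice for {self.attribute}.")
        return "\n".join(lines)


def squad_select(plane: str, base_list: List[str]) -> str:
    # Hypothetical stand-in for the opaque Lisp function.
    return "118TFW" if plane == "A-10" and "Sembach" in base_list else "unknown"


kb = {"AIRCRAFT": "A-10", "BASELIST": ["Ramstein", "Bitburg", "Sembach"]}
choose_squadron = Step("Unit", {"plane": "AIRCRAFT", "base-list": "BASELIST"}, squad_select)
choose_squadron.run(kb, "considering the feasibility of an A-10 for the mission")
print(choose_squadron.explain())
```

Because decide is an ordinary function, the sketch can only report its inputs and output, not its reasoning.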


This  explanation  may  be  unsatisfying.  A  better  explanation  in  this  case  might 
be: 


Assuming that we are to use A-10s and that the only bases available are
Ramstein, Bitburg, and Sembach, then 118TFW is the only unit that
flies A-10s out of these bases.

Some of the difference between this and Fig. 5 is the quality of the English text.
The only content difference, and content has been our focus, is in connecting
the assumptions (values of local variables) to the final decision. MPA could do
this better if the final decision were made using DSPL rather than the Lisp
function squad-select. A slightly improved version of the explanation would
appear as in Fig. 6. Because the explanation module is essentially just translating
DSPL code into text, the quality of the programming affects the quality of the
explanation. This is a little bit undesirable but also unavoidable in a system that
has to explain itself using only its own problem-solving knowledge.

The context of considering the feasibility of an A-10 for the mission determined that:

- plane was A-10

- base-list was Ramstein, Bitburg, Sembach

- units-with-A10 was 118TFW

So 118TFW was an appropriate choice for Unit.

Fig. 6. Improved Explanation for a Step


Users can select any of the local variables given in the explanation (i.e.,
plane and base-list) for further elaboration, for example to find out why
plane is A-10. This would result in getting an explanation from another step,
since attribute values are determined by steps. Or users can select the calling
agent for further explanation. This would result in an explanation from the A-10
plan, shown in Fig. 7. As with the step explanation, the context comes from the
calling agent, the Aircraft specialist here. The bulleted items are the purposes
from the called agents. Additionally, the explanation shows where the agent
was in its procedure. In this explanation, since the user arrived here from the
ChooseSquadron step, the plan had completed the AssignAircraftType step,
was in the process of doing the ChooseSquadron step, and had yet to do the
AssignBase and GetRange steps and complete the configuration.


In the context of selecting an appropriate aircraft for the mission I:

- Assigned A-10 as the aircraft type.

I was in the process of:

- Selecting a squadron for the mission.

and was about to do the following:

- Select a base for the mission.

- Determine the range for the mission.

- Choose a configuration for the A-10 on this mission.

Fig. 7. Explanation for a Plan


The explanations shown here are generated from explanation templates. Each
agent type has a standard representation form from which we derived its
explanation template. A simplified version of the standard form for plans is shown in
Fig. 8 (the simplification is that in addition to steps, plans can contain DESIGN
calls to subspecialists as in Fig. 3). Figure 9 shows a correspondingly simplified
explanation template for plans, assuming that it is entered from step i. Thus, a
plan's explanations are put together out of the goals of its calling specialist and
the goals of the steps it calls.


(PLAN
  (NAME ⟨plan name⟩)
  (SPONSOR ⟨sponsor name⟩)
  (PURPOSE ⟨purpose string⟩)
  (ACHIEVED ⟨achieved string⟩)
  (BODY
    ⟨step 1⟩
    ...
    ⟨step i⟩
    ...
    ⟨step n⟩))


Fig. 8. Standard Plan Representation


In the context of ⟨purpose of containing specialist⟩ I:

- ⟨achieved string from step 1⟩
  ...
- ⟨achieved string from step i - 1⟩

I was in the process of:

- ⟨purpose string from step i⟩

and was about to do the following:

- ⟨purpose string from step i + 1⟩
  ...
- ⟨purpose string from step n⟩

Fig. 9. Plan Explanation Template


Putting explanations together out of "canned" text, the way MPA does, is
not a very sophisticated method of text generation. However, the important
point here is that the roles of the various agents - specialists, plans, steps, etc. -
and their relationships define the kinds of things that can be said and how these
things go together to make sensible explanations. These roles and relationships
are defined by the GT, in this case Design by Plan Selection and Refinement.
We have more work to do on developing a taxonomy of PURPOSEs for the various
agents, and then showing how to use the taxonomy for explaining. However,
our aim for MPA was to demonstrate that GT programs provide the structures
needed to generate explanations of strategy and steps.
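
The plan explanation template can be instantiated mechanically, as the following Python sketch suggests. It is an illustration rather than MPA's code: the Agent tuple and the explain_plan function are hypothetical names, the index is 0-based, and several of the step strings are invented, since the text gives only the explanation output and the ChooseSquadron slots.

```python
from typing import List, NamedTuple


class Agent(NamedTuple):
    purpose: str   # text for "in the process of" and "about to do"
    achieved: str  # text for work already completed


def explain_plan(caller_purpose: str, steps: List[Agent], i: int) -> str:
    """Fill in the plan template, assuming it is entered from steps[i] (0-based)."""
    if not 0 <= i < len(steps):
        raise IndexError("step index out of range")
    lines = [f"In the context of {caller_purpose} I:"]
    lines += [f"- {s.achieved}" for s in steps[:i]]
    lines.append("I was in the process of:")
    lines.append(f"- {steps[i].purpose}")
    if steps[i + 1:]:
        lines.append("and was about to do the following:")
        lines += [f"- {s.purpose}" for s in steps[i + 1:]]
    return "\n".join(lines)


# Strings modeled on the plan explanation in the text; unseen ones are invented.
PLAN_STEPS = [
    Agent("Assigning the aircraft type.", "Assigned A-10 as the aircraft type."),
    Agent("Selecting a squadron for the mission.", "Selected a squadron for the mission."),
    Agent("Select a base for the mission.", "Selected a base for the mission."),
    Agent("Determine the range for the mission.", "Determined the range for the mission."),
    Agent("Choose a configuration for the A-10 on this mission.",
          "Chose a configuration for the A-10 on this mission."),
]

print(explain_plan("selecting an appropriate aircraft for the mission", PLAN_STEPS, 1))
```

The function needs nothing beyond the PURPOSE and ACHIEVED strings and the position of the step the user came from, which is the point the section makes about agent roles.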

4 Explanation Based on the Logical Structure of a Task

GTs combine tasks with appropriate methods, which enables explanations to
show how strategic elements combine to achieve the task's major goals. However,
as described by Chandrasekaran (see Sect. 3), for any task there are
many possible methods. To properly explain how a program's knowledge, strategy,
behavior, and conclusions relate to its problem-solving task, we need to
separate the task's requirements from those of the methods that perform it. For
example, one diagnostic goal is to find a disease that explains the symptoms.
One method would produce explanatory hypotheses using disease hierarchies,
another would produce them using causal models. Each method imposes its own
requirements and has a distinctive behavior, but both serve the same diagnostic
subgoal - generating explanatory hypotheses. An explanation should relate their
behavior to their subgoal in spite of the detailed differences between them. So it
is important to identify tasks' logical structure, independent of particular
solution methods, to be used in designing explanation components for systems that
perform them. In this section we describe Tanner's work on task explanation in
diagnosis [36].


4.1 The Logical Structure of Diagnosis

Diagnosis is usually considered an abduction problem [12, 17, 20, 28, 29, 30].
That  is,  the  task  is  to  find  a  disease,  or  set  of  diseases,  that  best  explains  the 
symptoms.  Accordingly,  a  diagnostic  conclusion  is  supported,  perhaps  implicitly, 
by  the  following  argument: 

-  There  is  a  principal  complaint,  i.e.,  a  collection  of  symptoms  that  sets  the 
diagnostic  problem. 

-  There  are  a  number  of  diagnostic  hypotheses  that  might  explain  the  principal 
complaint. 

-  Some  of  the  diagnostic  hypotheses  can  be  ruled  out  because  they  are:  (1) 
unable  to  explain  the  principal  complaint  in  this  instance,  or  (2)  implausible 
independent  of  what  they  might  explain. 

-  The  diagnostic  conclusion  is  the  best  of  the  plausible  hypotheses  that  are 
capable  of  explaining  the  principal  complaint. 

This  argument  form  is  the  logical  structure  of  the  diagnostic  task.  It  can  be 
thought  of  as  a  means  of  justifying  diagnoses.  As  such,  it  suggests  specific  ways 
a  diagnostic  conclusion  might  be  wrong. 
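
The argument form above can be read as a small procedure: collect hypotheses that might explain the complaint, rule some out, and pick the best survivor. The Python sketch below follows that reading under strong simplifying assumptions that are not from the text: hypotheses are single (not composite), "best" means highest plausibility score, and the findings, hypotheses, scores, and threshold are all invented placeholders.

```python
from typing import Dict, FrozenSet, List, NamedTuple, Optional


class Diagnosis(NamedTuple):
    conclusion: Optional[str]
    ruled_out: Dict[str, str]  # hypothesis -> reason it was ruled out


def diagnose(
    complaint: FrozenSet[str],
    explains: Dict[str, FrozenSet[str]],
    plausibility: Dict[str, float],
    min_plausibility: float = 0.2,
) -> Diagnosis:
    # Hypotheses that might explain the complaint: they cover at least one finding.
    candidates: List[str] = [h for h, covered in explains.items() if covered & complaint]
    ruled_out: Dict[str, str] = {}
    survivors: List[str] = []
    for h in candidates:
        if not complaint <= explains[h]:
            ruled_out[h] = "cannot explain the principal complaint"
        elif plausibility[h] < min_plausibility:
            ruled_out[h] = "implausible"
        else:
            survivors.append(h)
    # The conclusion is the best of the plausible, explanatory hypotheses.
    best = max(survivors, key=lambda h: plausibility[h]) if survivors else None
    return Diagnosis(best, ruled_out)


COMPLAINT = frozenset({"f1", "f2"})
EXPLAINS = {
    "H1": frozenset({"f1", "f2"}),
    "H2": frozenset({"f1"}),
    "H3": frozenset({"f1", "f2", "f3"}),
    "H4": frozenset({"f9"}),
}
PLAUSIBILITY = {"H1": 0.6, "H2": 0.9, "H3": 0.1, "H4": 0.5}

result = diagnose(COMPLAINT, EXPLAINS, PLAUSIBILITY)
print(result.conclusion, result.ruled_out)
```

Recording why each hypothesis was ruled out keeps the justification for a conclusion available afterwards, instead of only the conclusion itself.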

Suppose the diagnostic conclusion turns out to be wrong. What happened to
the true answer? That is, why did the true, or correct, answer not turn out to be
the best explanation? Based on the logical structure of diagnosis, given above,
the diagnostic conclusion can only be wrong for one or more of the following
reasons:

1. There is something wrong with the principal complaint. Either it is (1) not
really present or does not need to be explained, or (2) incomplete: there are
other things that should be explained by the diagnostic conclusion.

2.  The  true  answer  was  not  on  the  list  of  diagnostic  hypotheses  thought  to  have 
the  potential  of  explaining  the  principal  complaint. 

3.  There  is  an  error  in  ruling  out. 

(a)  The  true  answer  was  ruled  out.  It  was  mistakenly  thought  (1)  to  be 
implausible  or  (2)  not  to  explain  the  data. 
(b) The wrong answer (the one given) was not ruled out. It was mistakenly
thought (1) to be plausible or (2) to explain the data.

4. There is an error in choosing the best of the plausible explanations. Either
(1) the wrong answer appears to be better than it is, or (2) the true answer
appears to be worse than it is.

The source of these errors might be found in either missing or faulty knowledge
as well as in various problems with the data itself.

Many users' questions can be interpreted as attempts to ensure that the
conclusion  is  correct.  Thus,  corresponding  to  each  source  of  potential  error  there 
is  a  class  of  questions,  each  seeking  reassurance  that  a  particular  kind  of  error 
was  not  made.  This  analysis  tells  us  that  if  we  build  a  knowledge-based  system 
and  claim  it  does  diagnosis,  we  can  expect  it  to  be  asked  the  following  kinds  of 
questions. 

1.  Is  the  principal  complaint  really  present  or  abnormal? 

2. Does the principal complaint contain all the important data?

3.  Was  a  broad  enough  set  of  explanatory  hypotheses  considered? 

4.  Has  some  hypothesis  been  incorrectly  ruled  out? 

5.  Could  some  hypothesis  explain  a  finding  that  the  system  thought  could  not? 

6.  Was  some  hypothesis  not  ruled  out  that  should  have  been? 

7.  Is  it  possible  that  the  hypotheses  in  the  diagnostic  conclusion  do  not  really 
explain  the  findings? 

8. Might the hypotheses in the diagnostic conclusion be rated too high?

9.  Has  some  hypothesis  been  underrated? 

Furthermore, these questions express the only reasonable concerns that arise
solely because it is a diagnosis system. We are not suggesting that all questions
will be in exactly one of these classes: some may refer to many of these concerns,
and others are not specifically about diagnosis.


4.2  Using  the  Logical  Structure  for  Explanation 

Any diagnostic system will have some means of achieving the diagnostic goals
specified in the logical structure given above. Otherwise it will fail, in some
respect, to be a diagnostic system. The diagnostic question classes are derived
from the diagnostic goals, so the first step in building an explanation component
is to map the diagnostic question classes onto the program. That is, each
question class (say, "Is it possible that the hypotheses in the diagnostic conclusion
do not really explain the findings?") is mapped onto the part of the system
responsible for achieving the corresponding goal (in the example, the part that
determines the symptoms a diagnostic hypothesis explains). This way the
questions can be answered by the part of the system that made the relevant decisions
to explain how the decision helps achieve the goal. In order for this to work, the
explainer needs a way of mapping users' questions into the appropriate question
classes.
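
One simple way to realize the mapping from question classes to system components is a dispatch table, sketched here in Python. The QuestionClass names, the Explainer class, and the explain_finding component are hypothetical; the nine members follow the question list in Sect. 4.1, and the single recorded finding is taken from the Red example in this section.

```python
from enum import Enum, auto
from typing import Callable, Dict


class QuestionClass(Enum):
    COMPLAINT_PRESENT = auto()             # question 1
    COMPLAINT_COMPLETE = auto()            # question 2
    HYPOTHESES_BROAD_ENOUGH = auto()       # question 3
    WRONGLY_RULED_OUT = auto()             # question 4
    COULD_EXPLAIN_FINDING = auto()         # question 5
    WRONGLY_KEPT = auto()                  # question 6
    CONCLUSION_EXPLAINS_FINDINGS = auto()  # question 7
    CONCLUSION_OVERRATED = auto()          # question 8
    HYPOTHESIS_UNDERRATED = auto()         # question 9


class Explainer:
    """Routes each question class to the component that made the decision."""

    def __init__(self) -> None:
        self._routes: Dict[QuestionClass, Callable[[str], str]] = {}

    def register(self, qclass: QuestionClass, component: Callable[[str], str]) -> None:
        self._routes[qclass] = component

    def answer(self, qclass: QuestionClass, subject: str) -> str:
        component = self._routes.get(qclass)
        if component is None:
            return "No component is registered for this kind of question."
        return component(subject)


# Hypothetical record kept by the component that assigns findings to hypotheses.
EXPLAINED_BY = {"(164 Coombs 3+)": "antiS"}


def explain_finding(finding: str) -> str:
    return f"The finding: {finding} is explained by: {EXPLAINED_BY[finding]}"


explainer = Explainer()
explainer.register(QuestionClass.CONCLUSION_EXPLAINS_FINDINGS, explain_finding)
print(explainer.answer(QuestionClass.CONCLUSION_EXPLAINS_FINDINGS, "(164 Coombs 3+)"))
```

The table covers only the routing; classifying a free-form user question into one of the nine classes is a separate problem that this sketch does not attempt.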
User: What antibody in the conclusion explains the following test result:
(164 Coombs 3+)

Red: The finding:

(164 Coombs 3+)
is explained by:

antiS

Fig. 10. An Explanation From Red


Let  us  briefly  consider  an  example  from  a  diagnostic  system  called  Red. 
In  order  to  give  blood  to  patients  who  need  it,  a  hospital  blood  bank  must 
select  compatible  donor  blood.  A  part  of  this  decision  involves  finding  out  what 
antibodies  a  patient  has.  Red  is  a  system  that  aids  in  this  antibody-identification 
problem.  This  is  a  kind  of  diagnostic  problem  since  the  data  is  a  set  of  test  results 
to  be  explained  and  the  antibodies  are  used  to  explain  them.  One  type  of  question 
that  people  ask  of  Red  is  what  antibody  in  Red’s  conclusion  explains  a  particular 
test  result.  This  question  is  an  instance  of  the  question  class  defined  by:  “Is  it 
possible  that  the  hypotheses  in  the  diagnostic  conclusion  do  not  really  explain 
the  findings?”  This  is  derived  from  the  potential  error  that  the  answer  given 
does  not  actually  explain  the  data.  This,  in  turn,  is  derived  from  the  diagnostic 
goal of explaining the data. So the question ("What explains a particular test
result?") is directed to the component of Red that chooses antibodies to explain
particular elements of data. It produces an explanation such as the one in Fig. 10.
The "(164 Coombs 3+)" is the notation for a test result and "antiS" is shorthand
for "antibody to the S antigen". This process of mapping the question to the part
of the system that can answer it is not done automatically in Red. The logical
structure of diagnosis was used in building Red's explanation component, but
the mapping is hard-coded in the program. Tanner [30] describes explanation for
Red in more detail while Red itself was fully reported by Josephson, et al. [20].

The logical structure of diagnosis presented here is a common view of the
diagnostic task [12, 17, 20, 28, 29, 30]. Not all approaches to diagnosis will share
this view. In fact, there is one common competing view - diagnosis as description,
i.e., the goal of diagnosis is to describe the patient's state, not to find a cause
for the symptoms. But if users and systems agree to a logical structure, it can
be used to develop explanation for diagnosis in the manner we describe. The
details will change if the model changes, but the method, and the idea, of using
the logical structure to develop explanations remains.

5 Knowledge Justification: Relating Problem Solving to
Causal Models

The integration and use of causal models in compiled problem-solving systems
has become increasingly prevalent. Xplain was probably the first system to
provide explanations of problem-solving knowledge by showing how it was obtained
by compilation from other knowledge about the domain [26, 35]. Our work on
functional representations (FR) [31] is similar in showing how to compile
diagnostic programs from functional representations of mechanical devices. Following
on this, Keuneke's work [23] showed how to use FR for justifying diagnostic
conclusions, which we describe in this section.


5.1 Background

Methods that carry out problem-solving tasks need knowledge of certain kinds
and in particular forms. For example, establish-refine, a method for hierarchical
classification, requires knowledge relating descriptions of situations to
descriptions of classes (see Sect. 2). If knowledge is not available in this form, it must
be derived from some other knowledge. We refer to this derivation process as
compilation [31, 35, 15], and to knowledge in the desired form as compiled
knowledge [7]. The "other knowledge" has sometimes been called deep knowledge, but
it is not necessarily deeper or better, only less compiled relative to the method
that needs it. The compiled knowledge can be justified by referring to the
knowledge from which it was compiled.

As with any compiled knowledge, compiled diagnostic knowledge can be
justified by referring to the compilation process. Diagnosis also admits an interesting
variation on this type of justification. If a system for diagnosing faults in a
mechanical device is compiled from a causal model of the device, then its diagnostic
conclusions can be related to observations using the causal model. This
justifies the conclusion and validates the compiled knowledge that produced it. The
causal model could be used to perform diagnosis, and systems have been built
that do this [11, 27, 35], but for complex devices the large amount of causal
information makes the diagnostic task very difficult. In most diagnostic systems,
the causal knowledge is compiled for greater expertise and optimum diagnostic
performance. Then, if a causal story can be put together, using the hypothesized
diagnostic answer as a focus, we get the advantages of both worlds: the
computational benefits of compiled knowledge to obtain the diagnostic answer, as well
as the causal model to validate it.

In a diagnostic context, given the symptoms and the diagnostic conclusion,
Keuneke showed how to use a causal model to justify the diagnosis at various
levels of detail. In many situations a similar method will work to justify
individual rules in the knowledge base. Wick [39] developed a related idea: justifying
a conclusion in terms of the standard arguments used by domain experts. Both
Wick and our work using FR produce justifications by reference to knowledge
not used, perhaps not even needed, to produce the solution. However, one
important difference is that our justifications come from knowledge that, in principle,
could be used to compile diagnostic problem solvers while Wick is not committed to
any particular relationship between justification knowledge and problem-solving
knowledge. The intent of Keuneke's [33] research was to continue efforts in
the development of a device- and domain-independent representation capable of
modeling the core aspects of device understanding; the extended goal is a
cognitive model of device understanding. Although this work was driven by the task
of explanation, the representation was designed to provide the basic primitives
and organization for a variety of problem-solving tasks.

The Functional Representation. Initial efforts to generate causal
justifications [5, 6, 18, 24] focused on enhancing Sembugamoorthy and Chandrasekaran's
FR [31] to provide a representation with the necessary organization and primitives'.

Functional Representation is a representational scheme for the functions and
expected behavior of a device. FR represents knowledge about devices in terms
of the functions that the entire device is supposed to achieve and also of the
sequence of causal interactions among components that lead to achievement of
the functions. FR takes a top-down approach to representing a device, in contrast
to the bottom-up approach of behavior-oriented knowledge representation and
reasoning schemes [10, 14]. In FR, the function of the overall device is described
first and the behavior of each component (its causal process) is described in
terms of how it contributes to the function*. Figures 11 and 12 illustrate the
top-level representation of a chemical processing plant.

In this representation, a device's function is its intended purpose. Functions
describe a device's goals at the device level. For example, the function of a
chemical processing plant is to produce a certain product. It has components for
supplying the reactants, stirring the substance, extracting the product, and so
forth. But generating the product is a function of the device as a whole.

Functions are achieved by behaviors or causal processes. In the chemical
processing plant example, the substance is produced by the causal sequence of (1)
input of the reactants into the reaction vessel, (2) allowing reactant contact, and
then (3) extracting the product from the reaction vessel. In short, the functions
are what is expected; the behaviors are how this expected result is attained. In
FR, behaviors are represented as causal sequences of transitions between partial
states or predicates (e.g., (present reactants rxvessel)).
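
As a rough illustration of this split between what is expected and how it is attained, the Python sketch below stores a function's intended state alongside a behavior given as a chain of transitions between predicates. It is not the FR language: the class and field names are assumptions, and every predicate except (present reactants rxvessel) is invented to fill out the three-stage sequence described above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Predicate = Tuple[str, ...]  # a partial state, e.g. ("present", "reactants", "rxvessel")


@dataclass
class Transition:
    source: Predicate
    target: Predicate
    by: Optional[str] = None  # component function or more detailed process, if any


@dataclass
class Function:
    name: str
    device: str
    to_make: Predicate          # what is expected
    behavior: List[Transition]  # how the expected result is attained

    def is_causal_chain(self) -> bool:
        """True if each transition starts where the previous one ended
        and the last one reaches the intended state."""
        links = zip(self.behavior, self.behavior[1:])
        connected = all(a.target == b.source for a, b in links)
        return connected and bool(self.behavior) and self.behavior[-1].target == self.to_make

    def describe(self) -> List[str]:
        lines = []
        for t in self.behavior:
            via = f" (by {t.by})" if t.by else ""
            lines.append(f"{' '.join(t.source)} -> {' '.join(t.target)}{via}")
        return lines


produce = Function(
    name="produce-product",
    device="chemical-plant",
    to_make=("extracted", "product"),
    behavior=[
        Transition(("available", "reactants"), ("present", "reactants", "rxvessel"), by="supply"),
        Transition(("present", "reactants", "rxvessel"), ("in-contact", "reactants"), by="stirring"),
        Transition(("in-contact", "reactants"), ("extracted", "product"), by="extraction"),
    ],
)

print("\n".join(produce.describe()))
```

The by field is where a transition would point to a component's function or to a more detailed causal process.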

The device is represented at various levels. The topmost level describes the
functioning of the device by identifying which components and more detailed
causal processes are responsible for bringing about the various state transitions.
If a transition is achieved using a function of a component, the next level
describes the functioning of this component in terms of the roles of its
subcomponents, and so on. Ultimately, either by tracing through more detailed causal
processes or by expanding the causal processes of functional components, all
the functions of a device can be related to its structure and the function of the
components within this structure.

Enhancing the Functional Representation. Early explanation work [5]
simply used the FR as a tool to answer questions such as: (1) Why is this device
needed? (2) What subcomponents does this device require? (3) What does this
function accomplish? (4) Why or where is this function used in the device? (5)
How is this function achieved?

' For a more recent formal treatment of the representation, see [It]

* A function's causal process is represented in the machine by its causal process
description (CPD).

Fig. 11. Functional Component Hierarchy

Later, enhancements to the FR allowed the representation of state and
behavior abstractions [25, 23]. Abstract schema specifications, and the ability to make
transitions between abstraction levels, are useful for providing different levels of
explanation.

For example, suppose there exists a solid/liquid mixture in which action is
being taken to keep the solid from the bottom of the container. One might
witness the following causal loop:


                   by stirring
                -------------->
(solid falls)                    (solid rises)
                <--------------
                   by gravity

Here an observer could follow the loop any number of times, but somewhere
one takes a conceptual jump and identifies the dynamic process (solid fall, solid
rise, solid fall, solid rise...) as a state at a different behavioral level, i.e.,
identification of the process state (solid suspended). In doing so, one is identifying a
new phenomenon; the observer is packaging a process and seeing it from a higher
Fig. 12. CPD oxidation for Function produce.acid of a Chemical Processing Plant


conceptual viewpoint. These types of conceptual transitions are commonplace in
one's understanding of a device's behavior, especially in cyclic or repeated
behaviors. Nevertheless, past methods of behavior abstraction (detail suppression)
did not explicitly address the representation of such phenomena.

For  researchers  interested  in  building  models  of  devices  solely  to  predict  be¬ 
havior  at  a  given  level  of  detail,  these  abstractions  will  not  be  helpful.  Instead, 
these  abstractions  provide  the  ability  to  tell  a  higher  level  story.  Prediction  is 
not driven solely by constraints of structure and low-level processes, but can be
enriched  and  focused  by  knowledge  of  abstract  processes  and  the  inferences  they 
dictate. 

Additional enhancements include establishing a taxonomy of function types.
Each function type indicates different procedures for simulation, different
functional capabilities, different expectations, and thus different knowledge
specifications for representation and explanation. Function types include:

1. ToMake: achieves a specific partial state

2. ToMaintain: achieves and sustains a desired state

3. ToPrevent: keeps a system out of an undesirable state

4. ToControl: gives a system power to regulate changes of state via a known
relationship.

More detail on the knowledge distinguishing each type, explicit specifications of
the function types, and the information processing distinctions each type makes
is provided in [23, 22].
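To make the distinction between function types concrete, the sketch below encodes the four types as an enumeration and checks three of them against a recorded sequence of state snapshots. This is an illustrative reading of the one-line definitions above, not the specification given in [23, 22]; the names `FunctionType` and `is_achieved`, and the representation of a trace as a list of sets of predicate strings, are assumptions introduced here.

```python
from enum import Enum


class FunctionType(Enum):
    TO_MAKE = "ToMake"
    TO_MAINTAIN = "ToMaintain"
    TO_PREVENT = "ToPrevent"
    TO_CONTROL = "ToControl"


def is_achieved(ftype, trace, state):
    """Check a function of the given type against a trace of state snapshots.

    trace is a list of sets of predicate strings, one set per time step.
    """
    holds = [state in snapshot for snapshot in trace]
    if ftype is FunctionType.TO_MAKE:
        # The partial state is reached at some point.
        return any(holds)
    if ftype is FunctionType.TO_MAINTAIN:
        # The state is reached and then holds in every later snapshot.
        if True not in holds:
            return False
        return all(holds[holds.index(True):])
    if ftype is FunctionType.TO_PREVENT:
        # The undesirable state never occurs.
        return not any(holds)
    # ToControl needs a relationship between quantities, which this
    # simple trace representation does not capture.
    raise NotImplementedError("ToControl is not modeled in this sketch")


trace = [set(), {"(solid suspended)"}, {"(solid suspended)"}]
print(is_achieved(FunctionType.TO_MAINTAIN, trace, "(solid suspended)"))
```

The point of the sketch is only that each type calls for a different test over the same behavior, which is why the representation records the type explicitly.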

The structure of the functional representation, organized around functional
packages, provides focus through which simulation and the identification of
structural cause can be determined (i.e., given changes in function, what changes in
structure could be hypothesized to account for them?). Since, at some level, most
problem-solving tasks dealing with devices are concerned with either the
achievement of function, or consequences of the failure to achieve function, a functional
description and reasoning power proves useful. The use of the representation in
diagnosis seems especially appropriate since diagnosis centers around
determining the change in structure that resulted in some malfunction.

6  Causal  Explanation:  The  Problem  Statement 

To illustrate the use of the representation, we pose the following problem: Given
a  set  of  observations  and  a  diagnostic  hypothesis,  attempt  to  construct  an  ex¬ 
planation  in  the  form  of  a  causal  story  which  starts  with  a  diagnostic  hypothesis 
and  ends  with  one  or  more  of  the  observations  to  be  explained.  In  the  follow¬ 
ing,  we  examine  how  a  functional  representation  can  be  used  for  this  purpose. 
Technical  definitions  of  a  few  terms  may  be  useful: 

Observations: observable states, including a subset which are malfunctions
of the device subsystems or components. The following distinctions about
observations are useful:

- Symptoms: abnormal states which are indicative of malfunctions and
trigger the diagnostic process, e.g., specification of a drop from normal
pressure.

- Malfunctions: observations which are malfunctions, e.g., specification
of a faulty pressure regulator. Malfunction observations are generally
also symptoms.

- Observable states which provide information about the device but do not
directly correspond to abnormalities, e.g., specification of temperature
or pressure readings. Typically in a complex system, a large number
of observations are used in the diagnostic process which provide focus
for the problem-solving but do not necessarily indicate problems (e.g.,
sensor readings).

Diagnostic Hypotheses: the set of malfunctioning components or missing
(but expected) relationships between components. Each of the latter should,
sooner or later, manifest itself as the malfunction of a subsystem within
which the components lie.

Causal Explanation: Normally one expects a diagnosis to causally "explain"
the symptoms, even though in general the diagnosis actually should explain
all the observations. The explanation provided here takes any given set of
observations to be explained and tries to propose a causal path from the
diagnostic hypothesis to these observations.

The explanation sought can be formally stated as follows:

$\text{diagnostic hypothesis} \rightarrow x_1 \rightarrow \cdots \rightarrow x_N$

where each $x_i$ is either (1) an internal state which is causally relevant in
producing an observation, but is itself not a malfunction, (2) a component or subsystem
malfunction, or (3) an observation at the device level. The explanation system
developed in this work produces explanation chains where the members are
limited to the last two, i.e., malfunctions or observations, unless the causally relevant
internal state has been provided explicitly as a state that needs to be explained,
i.e., as input to the causal explanation system.


6.1  Generating the Malfunction Causal Chain

In  the  same  way  a  functional  representation  provides  an  organization  to  allow 
simulation of how expected functionalities are achieved, it can also serve as a
backbone  to  trace  the  effects  of  not  achieving  certain  functions  -  thus  identifying 
potential  malfunctions. 

The  organization  of  a  functional  representation  gives  both  forward  and  back¬ 
ward  reasoning  capability,  i.e.,  it  can  trace  from  the  hypothesized  malfunction 
to  the  observed  malfunctions  and  symptoms  (forward),  or  it  can  trace  from  ob¬ 
served  malfunctions  to  the  hypothesized  malfunction  (backward).  Because  both 
the  observations  and  the  diagnostic  hypotheses  have  been  identified  once  diagno¬ 
sis  is  complete,  the  functional  representation  could  potentially  be  used  to  perform 
either  form  of  control.  This  section  provides  an  algorithm  which  demonstrates 
the  forward  simulation  potential*. 

Specifically,  if  device  A  is  malfunctioning,  then  devices  which  use  device  A 
(say  devices  B  and  C)  have  a  high  probability  of  malfunctioning  as  well.  Similarly, 
devices  which  use  B  and  C  may  malfunction,  etc.  The  malfunction  causal  chain  is 
achieved  through  the  following  algorithm  which  has  been  condensed  to  illustrate 
main  points. 

1. - Set Observations to the symptoms and malfunctions to be explained,

   - Set MalfunctionList to the hypothesized malfunction set provided by the
     diagnosis,

   - Initialize MalfunctionObject to an individual malfunction in this set
     (diagnosed hypotheses and their relationship to observations are considered
     individually).

2. Find all functions which made use of the function which is malfunctioning
   (MalfunctionObject); call this set PossibleMalfunctions.

3. For each element in PossibleMalfunctions (call the specific function PossMal),
   consider the significance of the effect of MalfunctionObject on the function.

   - If there is no effect on PossMal, then remove it from PossibleMalfunctions:
     MalfunctionObject is not causing future problems. Consider the next element
     in PossibleMalfunctions.

   - Else maintain the (Malfunction → Malfunction) explanation chain;
     MalfunctionObject is now known to cause a malfunction to PossMal. Specifically,
     MalfunctionObject → PossMal is appended to the chain. Note that this step
     will ultimately place any potential malfunctions in a malfunction chain,
     including those which are in the set of Observations. Continue.

4. Check the states in the causal process description of the affected
   PossibleMalfunction. Would noncompletion of these states explain any symptom(s)
   in Observations?

   - If yes, append to ExplainedSymptoms and print the chain which led to
     this symptom. Complete the malfunction explanation chain by continuing.

5. Set MalfunctionObject to PossMal (MalfunctionObject ← PossMal).

6. Repeat the process from step 2 until all symptoms are in ExplainedSymptoms
   or the top-level causal process description of the device has been reached.

7. The process from step 1 is repeated until all elements of MalfunctionList
   have been considered.

* Note that since the explanation generation mechanism uses expected functionalities
and their causal processes rather than all behaviors that could possibly be generated,
the problem space is bounded and thus focused.

Step 2 is easily accomplished through the component hierarchy of the
functional representation (example in Sect. 6.2). Steps 3 and 4 are more intricate and
involve knowledge of function type and the achievement of the intended causal
processes.
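The forward trace described by the algorithm can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the implemented system: `used_by` (which functions use a given function), `cpd_states` (the states each function is expected to achieve), and the `affects` predicate are hypothetical stand-ins for the component hierarchy, the causal process descriptions, and the step 3 significance test. The sketch also explores every branch rather than stopping once all symptoms are explained, and the toy data places the temperature state directly in the CPD of `cool` for brevity.

```python
def explain_symptoms(hypothesis, observations, used_by, cpd_states,
                     affects=lambda cause, user: True):
    """Trace forward from a hypothesized malfunction to observed symptoms.

    Returns (symptom, chain) pairs, where chain lists the functions leading
    from the hypothesis to the function whose unachieved state is the symptom.
    """
    explanations = []

    def walk(malfunction_object, chain):
        # Step 2: functions that make use of the malfunctioning function.
        for poss_mal in used_by.get(malfunction_object, []):
            if poss_mal in chain:  # guard against cyclic use relations
                continue
            # Step 3: drop functions the malfunction does not affect.
            if not affects(malfunction_object, poss_mal):
                continue
            new_chain = chain + [poss_mal]
            # Step 4: would noncompletion of a CPD state explain a symptom?
            for state in cpd_states.get(poss_mal, []):
                symptom = f"NOT {state}"
                if symptom in observations:
                    explanations.append((symptom, new_chain))
            # Steps 5-6: PossMal becomes the new MalfunctionObject.
            walk(poss_mal, new_chain)

    walk(hypothesis, [hypothesis])
    return explanations


# Toy data loosely following the chemical plant example (assumed linkage).
used_by = {
    "provide.coolant": ["condense"],
    "condense": ["retrieveliquid", "cool"],
    "retrieveliquid": ["MixtureLevelCtrl"],
    "MixtureLevelCtrl": ["SolidConcCtrl"],
    "SolidConcCtrl": ["extraction"],
    "extraction": ["produce.acid"],
}
cpd_states = {
    "cool": ["(temperature rxvessel at.threshold)"],
    "produce.acid": ["(present product external.container)"],
}
observations = {
    "NOT (present product external.container)",
    "NOT (temperature rxvessel at.threshold)",
}

# Step 7: each diagnosed hypothesis is considered individually.
explanations = []
for malfunction in ["provide.coolant"]:
    explanations.extend(
        explain_symptoms(malfunction, observations, used_by, cpd_states))

unexplained = observations - {symptom for symptom, _ in explanations}
for symptom, chain in explanations:
    print(symptom, "is explained by:", " -> ".join(chain))
```

Because the search only follows "uses" links between expected functions, the space it explores stays small, which mirrors the focus the functional organization is meant to provide.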

For example, in step 3, to determine the effects of a malfunction on other
functions, one must consider the possible consequences of malfunctioning
components. In general, the malfunction of a component in a device can cause one
or more of the following three consequences:

NOT Function: expected results of the function will not be present. Given
that the malfunction is not producing the expected results within the causal
process, what states in those causal processes will not occur, and will lack of
this functionality cause the malfunctions of functions in which the
malfunctioning component was used?

Parameter Out-of-Range: expected results of the function are affected, but
behavior is still accomplished to a limited degree. Sometimes components
may be considered malfunctioning yet can still perform the behavior (or
value of some substance parameter) to the extent needed for future use.

New Behaviors: the malfunction results in behaviors and states which were
not those intended for normal functioning.

The  determination  of  whether  a  proposed  malfunction  can  explain  a  symp¬ 
tom,  step  4  in  the  explanation  algorithm,  can  be  established  by  a  number  of 
means: 

1. Check each state in the causal process description where the malfunctioning
component is used to see if there is a direct match between a symptom and
not achieving an expected state.

2. Check to see if the function which is malfunctioning has an explicit
malfunction causal process description and if the symptom is included therein.*

3. Check to see if side effects of the function's causal process description refer
to the symptoms.

4. Check each state in the malfunction causal process description and its
provided clause to see if expected states point to general concepts or generic
classes of behavior (such as leak, flow, continuity) and if the symptom
pertains to or is explained by such concepts.

5. If the malfunction is a malformation, i.e., the malfunction is described as
a malformation of a particular physical component, perform deep reasoning
(e.g., qualitative physics) to see if malformation could cause the symptom.

* See [23] for knowledge of function type and detail on functions with explicit
malfunction causal processes.

The first three are implemented and currently used for the explanation
generation; the means to perform the last two are research in progress.
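The three implemented checks can be illustrated with a short Python sketch. It is a hedged approximation only: the `FunctionSpec` class, its field names, the `explains` helper, and the predicate `(heat removed vapor)` are hypothetical, and symptoms are compared as plain strings rather than matched structurally as a real system would.

```python
from dataclasses import dataclass, field


@dataclass
class FunctionSpec:
    """Knowledge about one function (hypothetical layout)."""
    name: str
    cpd_states: list = field(default_factory=list)              # expected states
    malfunction_cpd_states: list = field(default_factory=list)  # explicit malfunction CPD
    side_effect_states: list = field(default_factory=list)      # side effect states


def negate(state):
    return f"NOT {state}"


def explains(spec, symptom):
    """Return the name of the first check linking the symptom to a
    malfunction of spec, or None if no check applies."""
    # Check 1: the symptom is the non-achievement of an expected state.
    if any(negate(state) == symptom for state in spec.cpd_states):
        return "unachieved expected state"
    # Check 2: the symptom appears in an explicit malfunction CPD.
    if symptom in spec.malfunction_cpd_states:
        return "explicit malfunction CPD"
    # Check 3: the symptom refers to a side effect of the function's CPD.
    if any(symptom in (state, negate(state)) for state in spec.side_effect_states):
        return "side effect"
    return None


cool = FunctionSpec(
    "cool",
    cpd_states=["(heat removed vapor)"],  # hypothetical predicate
    side_effect_states=["(temperature rxvessel at.threshold)"],
)

print(explains(cool, "NOT (temperature rxvessel at.threshold)"))
```

Ordering the checks from the most direct match to the least direct keeps the reported reason as specific as possible.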

6.2  Representation  of  a  Chemical  Processing  Plant 

This section provides the output for an example explanation in the domain of
Chemical Processing Plants. Reference to Fig. 11 (in Sect. 5.1) will assist the
reader in following the causal explanation chains given by the algorithm. The
hierarchy in Fig. 11 shows a partial representation of the functional components
with their intended functions (functions are specified under component names).
The top level function, produce.acid*, is achieved by the causal process oxidation
shown in Fig. 12. It should be noted that the function hierarchy is generated
given the causal processes used to achieve functions of the functional component.
For example, the Chemical Processing Plant uses the functional components
LiquidFeedSystem, AirFeedSystem, TransferSystem, etc. in the process oxidation
which represents the causal chain used to achieve the function produce.acid; the
TransferSystem uses the functional components AirFeedSystem, MixingSystem,
etc. in its causal process to achieve the function extraction, and so on.

The Problem. The Coolant System (identified at the right of Fig. 11) is used
to provide coolant water to a Condenser so that it can be used to transfer heat
from the vapor in the Condenser (see Fig. 13). Suppose the coolant water has
been completely cut off. A diagnostic system has concluded that a
malfunction of the function provide.coolant of the Coolant System explains the
symptoms of NOT (present product external.container) and NOT (temperature
rxvessel at.threshold). Specifically, MalfunctionObject is {provide.coolant of Coolant
System} and the Observations to be explained are {NOT (present product
external.container), NOT (temperature rxvessel at.threshold)}. The system produces
the following three causal stories.

Causal Story 1: Generation of Causal Connections. The causal process
SupplyReactants uses the functions retrieveliquid and LiquidConcCtrl, in
addition to the LiquidFeedSystem and AirFeedSystem. The explanation system
generates the following:

    The symptom

    NOT (present product external.container)
    is explained by the following chain:
    NOT provide.coolant causes
    malfunction in condense causing
    malfunction in retrieveliquid causing
    malfunction in LiquidConcCtrl causing
    problems in behavior SupplyReactants
    which is used in behavior oxidation and
    indicates malfunction of the top level
    function and results in
    NOT (present product external.container)

    The following symptoms are not explained:
    NOT (temperature rxvessel at.threshold)

* The acid produced is a solid, terephthalic acid.


The idea here is that if the required amount of reactants is not available,
the product is not produced as desired and thus cannot be retrieved. The
explanation system generates this chain by using the following information:
Provide.coolant caused a malfunction in condense because it caused a failure in
condense's behavior. A malfunction in condense caused a malfunction in
retrieveliquid because its achievement was required to attain the desired behavior
for retrieveliquid. Retrieveliquid caused a malfunction in LiquidConcCtrl because
it was needed to provide the preconditions for LiquidConcCtrl and it preceded
the use of LiquidConcCtrl in the behavior SupplyReactants. SupplyReactants
was used in the causal process Oxidation, Fig. 12, to achieve the state (present
reactants rxvessel). This state was necessary for the completion of the behavior
and thus non-achievement here denotes non-achievement of further states in the
behavior, particularly NOT (present product external.container).
Causal Story 2: The Use of Side Effect Inspection. The explanation
system continues and finds a causal connection for the second symptom, NOT
(temperature rxvessel at.threshold).

    The symptom

    NOT (temperature rxvessel at.threshold)
    is explained by the following chain:
    NOT provide.coolant causes malfunction
    in condense causing problems in behavior
    removeheat of function cool


Since  cool  is  not  a  top  level  function  of  the  chemical  processing  plant,  the  trace 
continues  until  all  consequences  are  determined. 

    The symptom

    NOT (temperature rxvessel at.threshold)
    is explained by the following chain:
    NOT provide.coolant causes
    malfunction in condense causing
    malfunction in cool causing problems
    in behavior compensate.oxidation.se
    a notable side effect behavior used in
    oxidation and indicates
    NOT (temperature rxvessel at.threshold)

    The following symptoms are not explained:
    (present product external.container)

Notice that this explanation identifies that the symptom was observed in a side
effect behavior (compensation for effects of the reaction) rather than a behavior
of the main functionality (production of acid).

The  statement  of  which  symptoms  are  not  explained  indicates  those  that 
were  not  explained  in  the  specific  causal  chain.  A  final  statement  is  made  when 
the  system  has  inspected  all  pertinent  causal  chains  (as  seen  in  the  next  causal 
story). 


Causal Story 3: Using Subfunction Connections for Causal Focus. The
final causal story is achieved via causal connections obtained specifically through
the knowledge of subfunctions. In its specification, the function extraction has a
provided clause which specifies that solid acid slurry must have the proper
consistency so that flow through the extraction tube is possible. The function
SolidConcCtrl is present in this device for the sole purpose of producing these
conditions for extraction.

The purpose of SolidConcCtrl is to keep the solid suspended and the
concentration in the reaction vessel at the proper consistency. In the
CondensateWithdrawalSystem, the retrieveliquid function uses the Condenser to retrieve
the condensate from the vapor produced. The MixtureLevelCtrl function then
uses a feedback controller to maintain the flow and thus the desired amount of
liquid in the reaction vessel, which ensures that the acid slurry has the proper
consistency. If the liquid is not retrievable, then the condensate flow
cannot be controlled and consistency of the acid in the vessel is not maintained.
The explanation system provides this explanatory story as follows:


    One function affected by provide.coolant
    is SolidConcCtrl which is a necessary
    subfunction of extraction

    The symptom

    NOT (present product external.container)
    is explained by the following chain:
    NOT provide.coolant causes
    malfunction in condense causing
    malfunction in retrieveliquid causing
    malfunction in MixtureLevelCtrl causing
    malfunction in SolidConcCtrl causing
    malfunction in extraction causing
    malfunction in produce.acid causing
    NOT (present product external.container)

    All symptoms have been explained.


6.3  Discussion


The intrinsic limitations of a functional representation for explanation arise
from its intrinsic limitations for simulation. The representation uses prepackaged
causal process descriptions which are organized around the expected functions
of a device. Simulations of malfunctioning devices are thus limited to statements
of what expectations are "not" occurring.

This limitation affects the capabilities for explanation in two significant ways.
First, the functional representation is not capable of generating causal stories of
malfunctions which interact unless the device representation has this interaction
explicitly represented. Similar problems regarding the interactions of
malfunctions arise in diagnosis [3^]. Secondly, "new" behaviors, i.e., behaviors which are
not those intended for normal functioning but which arise due to a change in
device structure, could potentially lead to symptoms which cannot be explained
using the functional representation. Current research efforts focus on how a
functional organization might be used to determine these new behavioral sequences,
in addition to how conventional methods of qualitative reasoning may be
integrated.


7  Additional Applications of a Functional Model

The  idea  of  considering  how  devices  work  is  a  generally  useful  concept  which 
provides  a  focus  for  reasoning  about  objects.  Since  goals  can  be  attributed  to 
many  types  of  objects,  a  general  representational  language,  focused  around  func¬ 
tionality,  can  potentially  model  an  understanding  of  a  variety  of  object  types, 
i.e.,  truly  a  “device-independent”  representation.  In  addition,  the  organization 
around  functions  helps  to  focus  a  reasoner’s  attention  toward  expected  goals; 
something works like it does because it is meant to achieve a specific purpose.
The  practical  uses  of  having  a  functionally  oriented  understanding  of  how  some¬ 
thing  works  can  be  seen  in  the  following  applications: 

diagnosis: How something works provides information about what functions to
expect from working models, and thus implicitly knowledge of malfunctioning
models. This helps to enumerate malfunction modes and to derive what
observable consequences follow for a given malfunction.

learning: In diagnosis, if a hypothesis has been made and a causal chain cannot
be found that connects the hypothesis to the symptoms, a learning process
could be triggered. Specifically, a diagnosis which cannot be causally
connected to the symptoms might cause suspicion, not only about the diagnostic
result, but also about the knowledge used in the diagnostic process. Use of
the malfunction causal explanation capabilities can help explicate erroneous
malfunction hypotheses and aid in pointing to alternatives. [37]

repair/replacement: Knowledge of how a device works indicates knowledge
of its teleology. Replacement with objects of like teleology can be considered.

design/redesign: Knowledge of what functionalities are desired can point the
designer to necessary components. [16, 19]

planning: The representation of plans (as devices) provides an understanding
of how the plan's goals are achieved. [5]

determination of optimum use: Knowledge of how a device works can
provide information regarding how to use the device to its maximum potential.

analogy: Organizing knowledge of how one object works provides links for
determining how a similar object might operate.

prediction: Knowledge of expected functionalities focuses reasoning for
determining what will happen in a device. [32]

simulation: Simulation of expected device behavior is useful for problem
solving, in particular, design. [34, 19]

explanation: Having the knowledge of how something works allows one to
simulate and explain the mechanism, i.e., for tutorial purposes.


8  Conclusion 

In this paper we have surveyed the work done at the Ohio State Laboratory
for AI Research on knowledge systems explanation. We consider the explanation
problem to have three aspects: the explanation content, the form of
presentation, and the manner of presentation. We have concentrated on the explanation
content, which we see as having four parts: explaining problem-solving steps,
strategy, and task, and justifying knowledge. Most of our work on these has
been guided by GT theory: any task can be accomplished by many different
methods, and the combination of a particularly appropriate, domain-independent
method with a task is called a generic task. GT research has identified several
generic tasks, and a knowledge system that uses a generic task can explain its
steps and its strategy, since strategy is an aspect of the method. By
combining generic tasks with a theory of tasks, independent of method, it is possible
to give explanations that show how a system's method achieves the task goals.
Using the functional representation, also developed at the LAIR, to represent
general purpose knowledge in the knowledge system's domain, we can justify its
problem-solving knowledge by showing how it was derived. Individually, each of
the efforts described here solves a few problems and leaves many issues
unaddressed. Taken as a whole, they represent an attempt to explore the wide range
of roles that knowledge plays in explanation - knowledge about tasks, methods
and strategies, system design, background domain knowledge, and memory for
particular problem-solving instances - and the benefits of explicitly representing
these kinds of knowledge.
Functional Representation as Design Rationale
Although a design rationale cannot be completely represented, Functional
Representation is a good framework for describing causal components because it
embodies a theory of how causal stories are understood.


The design process involves exploring design spaces, simulating and
verifying candidate designs, and possibly redesigning and repeating the cycle.
The  body  of  information  that  explicitly  records  the  design  activity  and  the 
reasons  for  making  choices  (and  reasons  for  not  making  some  choices)  is  called  the 
design  rationale  (DR).  As  more  of  the  design  process  gains  computational  support, 
some  designers  are  focusing  on  the  tasks  of  defining  the  components  of  DR  and 
casting  it  into  a  form  that  can  be  recorded  and  manipulated  computationally. 

Research is addressing what kinds of information DR should contain and how
to express it. In a recent special DR issue of the journal Human-Computer
Interaction,' MacLean et al.* proposed a semiformal notation called Questions-
Options-Criteria (QOC) to represent the design space. As the space is explored,
Questions identify key design issues, Options provide possible answers to these,
and Criteria help assess the options. Lee and Lai’ proposed a language called DRL,
which  provides  a  vocabulary  for  recording  design  alternatives,  the  evaluation 
space  and  criteria,  and  the  argument  structure  in  which  design  discussions  are 
conducted. 

Lee  and  Lai  make  a  useful  distinction  between  using  DR  as 

(1)  a  record  of  the  exploratory  activity  of  the  design  team  (along  the  lines  of  the 
information  captured  by  the  QOC  formalism)  and 

(2)  an  account  of  how  the  designed  artifact  serves  or  satisfies  expected  func¬ 
tionalities. 

The  distinction  is  essentially  one  of  describing  the  designer’s  activity  (what 
alternatives  were  considered  and  what  choices  were  made  and  why)  versus 
describing the artifact’s functioning. We consider the use of a representation called
Functional Representation (FR) for describing how the device works (or is
intended to work). Specifically, we wish to show how FR can be used to capture the
causal  component  of  DR.  By  that  we  mean  the  designer’s  (or  the  design  team’s) 
account  of  the  causal  interaction  sequence  between  device  components  that  leads 
to  achieving  device  functions. 

Some of the tasks for which DR can be used are

(1) Controlling distributed design activity. In concurrent engineering, the DR for
design  decisions  made  by  one  group  can  be  used  by  other  groups  to  avoid 
redundancy  of  effort  and  incompatible  design  choices. 

(2) Reassessing device functions. During the period of device use, the component
values might drift or even change qualitatively. Users might examine DR
to see whether the device can still be expected to achieve the desired functions.
They might also examine it to evaluate deviations from the expected
range of behavior.

(3) Generating diagnostic knowledge.
DR  can  support  the  generation  of  diag¬ 
nostic  procedures,  thus  helping  main¬ 
tainability. 

(4) Simulating and verifying design.
DR can help evaluate whether the device will perform as intended. In particular,
DR information can assist in focusing and controlling device simulation
to verify expected functions.

(5) Redesigning and case-based designing. When it is desirable to change a
device’s function, much of the structure can be retained if the intended change is
not drastic. DR can help identify the components, subsystems, or parameters
needing change, thus avoiding a new design from scratch. Similarly, when a
new device is being designed, previous designs can be examined for similar
functionality. The design that requires the least modification can be chosen
and modified (case-based design). DR can also be useful in identifying the
design that is “closest” to the desired device and in noting where this design
needs modification.

The  FR  language  includes  elements 
for  capturing  DR  in  a  form  that  can  be 
helpful  in  performing  some  of  these 
tasks. 

Functional Representation


As stated previously, FR is a representational scheme for the causal
processes that culminate in the achievement of device functions. (Not all
devices are best viewed as achieving their functions by means of causal
processes, which we discuss later.) FR takes a top-down approach to
representing a device in the sense that the overall function is described
first and the behavior of each component is described in the context of
this function. This contrasts to the bottom-up approach taken in many
behavior-oriented knowledge-representation and reasoning schemes. (See the
sidebar.)

Figure 1 is a schematic of a device called a nitric acid cooler (NAC). In
FR, we represent the structure and function of a device and the causal
processes that occur within it. We use as primitive notions the ideas of a
system, its input, and its output behavior. A device is a system with some
intended input-output relations, called functions. A component of a device
is itself a system characterized by its functions.

Classes of functions and components can often be described by use of
parameters. In the NAC example, component class "pipe(l, d)" describes
pipes with length l and diameter d, while "pipe2" is a specific instance
of "pipe(l, d)," with specific values for l and d. Similarly, the device
NAC as a class has a function "cool-input-liquid(rate, temperature-drop),"
where "rate" and "temperature-drop" are capacity parameters of the
function "cool-input-liquid." A particular NAC can be identified by
specific values for these parameters. Devices can have substances whose
properties are transformed as part of their functions. Substances can be
destroyed and new ones created. They can be physical (as in "nitric-acid")
or abstract (as in "heat"). In the NAC example, the substance
"nitric-acid" has the properties "temperature," "flow rate," and "amount
of heat" (which itself is a substance).

Components are configured in specific structural relations to each other
to compose a device. Components thus have ports, at which they join other
components in certain relations. In an electrical circuit, components are
"electrically-connected" at defined terminals. In the NAC example, the
relations include "conduit-connection," "containment," etc. (The
relational vocabulary can also include unintended relations, such as
"electrical leakage between components." The components can be in
unintended relations as a result of malfunctions.) The vocabulary of
relations is domain-specific. The relation semantics are established by
the domain laws that govern the behavior of the components in the given
relations.

Device structure. A device's structure is a specification of the set of
its components and their relationships. The components are represented by
their names and by the names of their functions, which are all
domain-specific strings. Variables can serve as component parameters. All
components have ports to connect with other components. For example, the
component type "pipe" might be written as "pipe(l, d; t1, t2)," where l
and d are the length and diameter, and t1 and t2 are the input and output
ports. The FR language uses keywords for describing structure, as shown in
Figure 2. The capitalized keywords are FR (and hence DR) terms.

Figure 3 describes the structure of NAC. The terms in italic are
domain-specific names for functions, components, relations, etc. The FR
interpreter treats them as strings. Additional domain-specific
interpreters may be written that can use the italicized terms as
meaningful keywords. For example, a mechanical simulator can use such
terms as "contained-in" and "conduit-connected" to perform simulations.
For the purpose of this exposition, they are to be understood in their
informal, English-language meanings. The syntax of the Relations keyword
is that an n-ary relation has n components. The Ports term indicates
connected ports. Note that the components are described purely in terms of
their functions. Components can thus, in principle, be replaced with
structurally different but functionally identical components. Further, the
components themselves can be represented as devices on their own terms.
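
The structural vocabulary just described can be sketched as a small data
structure. This is only an illustration, not the FR syntax of Figures 2 and
3: the class names, the port names, and the function names given to "pipe2"
and "chamber" are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Component:
    name: str         # domain-specific string, e.g. "pipe2"
    functions: tuple  # names of the component's functions (assumed here)
    ports: tuple      # port names (assumed here)


@dataclass
class Device:
    name: str
    components: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)  # (relation, component names)

    def add(self, component):
        self.components[component.name] = component

    def relate(self, relation, *names):
        # An n-ary relation names n components; all of them must be known.
        missing = [n for n in names if n not in self.components]
        if missing:
            raise KeyError(f"unknown components: {missing}")
        self.relations.append((relation, names))


nac = Device("NAC")
nac.add(Component("pipe2", ("conduit", "heat-exchange"), ("t1", "t2")))
nac.add(Component("chamber", ("contain",), ("in", "out")))
nac.relate("contained-in", "pipe2", "chamber")
```

Because components are referenced only by name and function names, a
structurally different component with the same functions could be swapped
in without touching the relations.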

States and partial states. A device state is represented as a set of state
variables consisting of values of all variables of interest in the device
description. State variables can be continuous or discrete. In particular,
some of the variables can take the truth values (T, F), that is, they are
defined by predicates. An example of a continuous variable is "water
temperature" in a device that uses water for cooling a substance. An
example of a variable defined by a predicate is "open?(valve)" (not shown
in the figures). This variable takes T or F as a value, depending on
whether the valve is open or shut.

In describing functions and causal processes, we generally talk in terms
of partial states of the device. Those states are given by the values of a
subset of state variables. For example, the partial state (call it
"state1") of NAC (describing some relevant state variables at the input p1
of the device) can be given as

(substance: nitric acid; location (substance): p1; temperature
(substance): T1).

"State2," describing the properties of nitric acid at location p2, differs
only in the location parameter, while the partial state description,
"state3," at location p3 is

(substance: nitric acid; location (substance): p3; temperature
(substance): T2),

where T2 < T1. The language in which states are represented is not part of
FR and is largely domain specific. In economics, the state variables would
be entities such as "GNP" and "inflation-rate"; in nuclear plants, the
entities would be "radiation-level," and so on. Goel defined a
state-description language useful for devices dealing with material
substances that change locations, such as those in which substance flow is
a useful notion. The state representation that we just used for NAC
employs this language.
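
As a rough sketch of the idea of a partial state, one can treat a state as
a mapping from state variables to values and a partial state as a mapping
over a subset of those variables. The dictionary encoding, the `satisfies`
helper, and the "flow rate" value are assumptions for illustration; they
are not Goel's state-description language.

```python
def satisfies(state, partial):
    """True if every variable named in the partial state has that value in the state."""
    return all(state.get(var) == val for var, val in partial.items())


# Partial states of the NAC example, written as plain mappings.
state1 = {"substance": "nitric acid", "location": "p1", "temperature": "T1"}
state2 = {**state1, "location": "p2"}  # differs only in the location

# A fuller device state may mention variables the partial state ignores.
full_state = {
    "substance": "nitric acid",
    "location": "p1",
    "temperature": "T1",
    "flow rate": "r",  # hypothetical extra variable
}

at_input = satisfies(full_state, state1)
at_p2 = satisfies(full_state, state2)
```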

Causal process description. The intuitive idea here is that we understand
how devices work by building a causal description of how they go through
various states until the desired ones are reached. We explain the causal
transitions between states by appealing to component functions and domain
laws (such as scientific laws). The causal process description (CPD) is a
directed graph with two distinguished nodes, N_init and N_fin. Each node
in the graph represents a partial description of a device state. N_init
corresponds to the partial state when the conditions for the function are
initiated (such as when a switch is turned on). N_fin corresponds to the
state when the function is achieved. Each link represents a causal
connection between nodes. One or more qualifiers are attached to the links
to indicate the conditions under which the transition will take place. One
or more annotations can be attached to indicate the type of causal
explanation to be given for the transition. The graph can be cyclic but
must have a directed path from N_init to N_fin.
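
The well-formedness condition on a CPD (a directed path from the initial
node to the final node, with cycles allowed) can be checked with an
ordinary breadth-first search. The sketch below is an illustration only;
the function name and the back link from "state3" to "state2" are
assumptions added to show that cycles do not disturb the check.

```python
from collections import deque


def has_directed_path(links, start, goal):
    """Breadth-first search over (source, target) links of a CPD graph."""
    graph = {}
    for src, dst in links:
        graph.setdefault(src, []).append(dst)
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            return True
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False


# NAC-like chain of partial states; the last link is a hypothetical cycle.
cpd_links = [("state1", "state2"), ("state2", "state3"), ("state3", "state2")]
well_formed = has_directed_path(cpd_links, "state1", "state3")
```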

Consider the NAC example again. Let nodes "state1," "state2," and "state3"
correspond to the states of nitric acid at the input to "pipe1," at
locations p2 and p3, respectively. Figure 4 depicts the CPD graph (without
any annotations or qualifiers), describing what happens to the nitric acid
and water as they flow through the chamber. We have represented the nodes
in informal English, but they can be described more formally, similar to
our previous description of "state1."

We have identified three types of annotation for explaining a causal
transition: appealing to another causal process, to a function of a
component, and to domain laws (so-called first principles). Let's
elaborate on the three different types of knowledge used for explaining a
causal transition.

(1) By-CPD. In explaining the transition "water heated -> steam created,"
we can point to the causal process of "boiling," which is part of the
commonsense knowledge of most humans. The agents for whom the explanation
is intended will accept the explanation if they already understand the
process or if the details of the process do not matter to their purposes.
If the details do matter, they can ask for a more detailed explanation of
the causal transitions involved in "boiling." In any case, this enables
the process explanation to be hierarchically composed from other process
explanations, shortening explanations at each level. Once "boiling" is
understood, it can be reused, possibly after instantiating some parameters
(such as the pressure at which boiling is performed and the liquid that is
being boiled), whenever it is needed to explain other processes. Human
domain expertise contains knowledge of a large number of such causal
processes that can be parameterized and reused.

(2) By-Function-Of-<component>. In an electrical circuit, the state
transition "Switch(on) -> Voltage v at the terminals" might be explained
by pointing to the function of the battery as a source of voltage, that
is, by the annotation, "By-Function(voltage generation) Of
Component(battery)." In fact, a major goal of causal explanation is to
detail device behavior in terms of component properties and
interconnections. Again, a large part of human domain expertise is in the
form of knowledge about generic components and their functions, even if
the details of component function are unknown. The ability to explain
device functions partly in terms of component functions and to explain the
latter in terms of subcomponents helps form functional/component
hierarchies in explanation and design.

(3) By-Domain-Law<...>. Another form of explanation occurs through appeal
to domain laws. In the engineering domain, scientific laws provide the
ultimate explanation for device behavior. For example, the state
transition, "5 volts at the input -> 2 amperes through the load" might be
explained as "By-Domain-Law(Ohm's law: voltage = current * resistance)."

For a particular device, any realistic FR description tapers off at some
level of components and CPDs. This is an example of the general
incompleteness inherent in FR and DR.

The function of a device (or a component) is explained by pointing to a
CPD. When we use a By-Function annotation, we are actually pointing to a
CPD that partly explains that function. There are two differences between
these annotations. The first one is how the CPDs are indexed. In the case
of a By-CPD annotation, we are pointing to a piece of prior knowledge
explicitly labeled as a causal process, while in the case of a By-Function
annotation, we are pointing to prior knowledge of a component function.
The second difference is that a component function may not have a CPD
explicitly available to explain it. Often we know many components only by
their functions.

Sometimes additional noncausal links must be added to arrive at the
predicate of interest. For example, for an amplifier, we may have
constructed the CPD,

Voltage 1 at the input -> Voltage 10 at the output,

but the function that needs to be explained might be "amplification of
10." A noncausal link can be used to arrive at the node "amplification of
10" from "Voltage 10 at the output." Such links can be used to indicate an
inference that follows from predicates in the earlier nodes.

In addition to annotations, the links may have qualifiers that state
conditions under which the transition takes place. In FR, the qualifier
Provided(P) indicates that condition P must hold during the causal
transition for the transition to be initiated and completed, and If(P)
indicates that condition P must hold at the moment the causal transition
starts. The conditions can refer to the states of any component or
substance. Many of these qualifiers are eventually translated as
conditions on the structural parameters.
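
The difference between the two qualifiers can be made concrete with a
small sketch: an If condition is tested only on the state in which the
transition starts, while a Provided condition is tested on every state
visited during the transition. The function name, the list-of-states
encoding, and the two example predicates are assumptions for illustration
and are not part of the FR language.

```python
def transition_enabled(trajectory, provided=(), if_conds=()):
    """trajectory: states (dicts) visited during the transition, start state first."""
    if not trajectory:
        return False
    # If(P): P must hold at the moment the transition starts.
    start_ok = all(cond(trajectory[0]) for cond in if_conds)
    # Provided(P): P must hold throughout the transition.
    throughout_ok = all(cond(state) for state in trajectory for cond in provided)
    return start_ok and throughout_ok


def low_acidity(state):
    return state["acidity"] == "low"


def valve_open(state):
    return state["valve_open"]


# The valve is open at the start and shuts before the transition completes.
trajectory = [
    {"acidity": "low", "valve_open": True},
    {"acidity": "low", "valve_open": False},
]
as_if = transition_enabled(trajectory, provided=[low_acidity], if_conds=[valve_open])
as_provided = transition_enabled(trajectory, provided=[low_acidity, valve_open])
```

With this trajectory the valve condition is acceptable as an If qualifier
but not as a Provided qualifier.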

Figure 5 shows a fully annotated causal transition from "state2" to
"state3" in the Nitric Acid Cooler. It uses two functional annotations and
one domain-law annotation, and employs conditions on the substances and
structure as qualifiers. The qualifiers include conditions on the
properties of the substance (it should be a "liquid of low acidity") and
structural conditions ("the chamber fully encloses pipe2"). Note that a
transition may have more than one annotation or qualifier.

Functions. Every device has intended functions. Keuneke has identified
four types: ToMake, ToMaintain, ToPrevent, and ToControl. Formal
definitions of these function types have been developed, but for our
purposes, the following informal ones should suffice. All the function
types above except ToControl take as argument a predicate, say P_F,
defined over the state variables of the device. The function is the ToMake
type if the goal is to make the device reach a state in which P_F is true.
After that state is reached, no specific effort is needed to keep the
predicate's value to True (or it doesn't matter what state the device goes
to after the desired state is reached).

A function is type ToMaintain if the intention is to take the device to
the desired state and if the device must causally ensure that the
predicate remains True in the presence of any external or internal
disturbance that could change the device state. The function type is
ToPrevent if the goal is to keep P_F from being true and some active
causal process in the device must ensure it. (Logically, ToPrevent P can
be written as ToMaintain (Not P), but there are important differences in
practice. Pragmatically, a designer charged with ensuring that a device,
for example, doesn't explode uses knowledge indexed and organized for this
purpose. Preventing this explosion by using, say, a thick pipe is not the
same as maintaining some dynamic state variable in a range.)

The function type ToControl takes as argument a specified relation
v_0 = f(v_1, ..., v_n) between state variables v_0, v_1, ..., v_n, and the
intent is to maintain this relationship. That is, we wish to control the
value of a specific variable as a function of the values of some other
variables.
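
The informal definitions of the four function types can be illustrated as
checks over an observed sequence of states. This is only a sketch under
strong assumptions: it tests what a recorded trace shows, not whether the
device causally ensures the outcome, and the function names, the tolerance
parameter, and the example traces are invented for the illustration.

```python
def to_make(trace, p):
    # ToMake: some state in which the predicate is true is reached.
    return any(p(s) for s in trace)


def to_maintain(trace, p):
    # ToMaintain: the predicate becomes true and stays true afterwards.
    for i, state in enumerate(trace):
        if p(state):
            return all(p(later) for later in trace[i:])
    return False


def to_prevent(trace, p):
    # ToPrevent: the predicate never becomes true.
    return not any(p(s) for s in trace)


def to_control(trace, output, inputs, f, tol=1e-9):
    # ToControl: the relation output = f(inputs) holds in every state.
    return all(abs(s[output] - f(*(s[v] for v in inputs))) <= tol for s in trace)


cooling = [{"temp": 90}, {"temp": 60}, {"temp": 40}, {"temp": 38}]
amplifier = [{"vin": 1.0, "vout": 10.0}, {"vin": 2.5, "vout": 25.0}]


def cooled(state):
    return state["temp"] <= 40


def overheated(state):
    return state["temp"] > 100


results = (
    to_make(cooling, cooled),
    to_maintain(cooling, cooled),
    to_prevent(cooling, overheated),
    to_control(amplifier, "vout", ("vin",), lambda v: 10 * v),
)
```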

Function F thus has the descriptive elements shown in Figure 6. Now
reconsider the example in Figure 1. Hot nitric acid goes into a heat
exchanger, exchanging heat with the water that is being pumped in. The
water becomes hotter while the acid becomes cooler. Figure 7 provides the
functional definition of NAC. The complete FR is given by specifying the
device name, its structure, state variables of interest, functions of
interest, and the functional template, including the CPDs. The FR language
has many implementations, each with a somewhat different syntax. We have
used a composite syntax, chosen mainly for expository effectiveness. We
have suppressed many details by giving English-language descriptions of
the intended information within parentheses or curly brackets. For
example, we say "Qualifiers: (appropriate enclosures of pipes in chamber)"
in Figure 5. Goel provides a detailed syntax for representing the relevant
relations about pipes.

Using FR to represent a device. A language by itself does not fully
specify how it should be used; therefore, a few remarks on how FR should
be used are in order. First, there is no unique FR for a device. There can
be differences in the ways various agents describe how a device works. The
differences might relate to how the explanation was decomposed, what
background knowledge was assumed, and the intended uses of the causal
explanation. Second, choices have to be made about what functions should
be explicitly represented. While every device has intended functions,
there are a number of implicit functions (such as a design specification
of a TV set that does not explicitly record that it is not intended to
explode).

Range of FR applicability. While functional representation as an idea is
quite general, the set of primitives discussed so far only covers a
subclass of devices. Not all devices have functions that have to be
understood in terms of a temporally evolving causal process. For example,
the device "chair" has the function "to support the seating of a human."
However, this function is not accomplished by any causal process that
involves state changes. Instead, the function is achieved by the chair's
shape. Bonnet has identified a function type called ToAllow to account for
these instances.

Second, FR discretizes continuous causal processes into node-to-node
transitions in which some predicate of interest changes at each node.
There are many phenomena for which explanations are given by simply
displaying the behavior of a continuous function. For example, gear teeth
smoothly [...] can best be explained by showing the displacement functions
of the teeth.

Third, more research is needed to represent functions having temporal
relations between states that serve as an essential part of the function
definition. For example, an electronic sawtooth signal generator goes
through a certain series of state changes when charging and discharging
its output capacitor. The function to be achieved corresponds not to the
device being in a single desired state but to its repeatedly cycling
through a sequence of states, with each state having a certain duration.

Representing these aspects of a causal process is an open research issue.
On the other hand, there is no real constraint that the devices must be
physical for useful FRs to be constructed for them. The FR framework has
been used to represent computer programs and to reason about program
errors. FR has also served as a vehicle for reasoning about complex
physical systems; references 4, 6, 7, and 10 provide some examples.

FR  as  design  rationale 

While FR does not encode available alternatives to a choice and why they
were not chosen, it does provide a partial rationale for choices made
about components and their configuration. We say partial because the CPD
encodes only the directly causal role of a component. Of course, there
could be other reasons for choosing a particular component value. Also,
the role of a component in achieving a function can be distributed over
several transitions and different CPDs. The rationale for why a component
was chosen might actually reflect these multiple constraints. In an
electrical circuit, the real explanation for why a resistance was chosen
to be 1,000 ohms might be that the resistance had to be less than 2,000
ohms for a specific transition, greater than 600 for another transition,
and standardized at 1,000 ohms for easy procurement. Because the latter
information is not part of the causal account, it is not directly
represented in FR.
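
The resistance example can be worked through in a few lines. The two
bounds come from the text; the list of standard values is a hypothetical
catalog introduced only to show that the final choice depends on
information outside the causal account.

```python
def feasible_interval(constraints):
    """Intersect open intervals (low, high); return None if they do not overlap."""
    low = max(lo for lo, _ in constraints)
    high = min(hi for _, hi in constraints)
    return (low, high) if low < high else None


# Constraints contributed by two different CPD transitions (in ohms).
causal_constraints = [
    (float("-inf"), 2000.0),  # less than 2,000 ohms for one transition
    (600.0, float("inf")),    # greater than 600 ohms for another
]
interval = feasible_interval(causal_constraints)

# Hypothetical list of easily procured values; not part of the FR.
standard_values = [470.0, 1000.0, 2200.0]
choices = [r for r in standard_values if interval[0] < r < interval[1]]
```

The causal constraints alone leave a whole range of admissible values; the
procurement information narrows it to 1,000 ohms.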

We earlier identified a number of tasks that DR should be able to support:
diagnostic knowledge generation, simulation, design verification, and
redesign/case-based design. FR can support many of these tasks to the
degree that they require causal knowledge about device operation. Of
course, all these tasks can also benefit from information that comes from
other DR components. In particular, the redesign task often requires a
knowledge of why certain choices were not made. Additional knowledge from
other DR components could be useful for those aspects of the task. In this
section, we give brief overviews of how to use FR for diagnostics
generation, design verification, and redesign.


For simplicity, let's first consider a CPD in which each transition has
only one annotation. Consider a transition in a CPD of the form

n_1 --[By-Function F-Of-Component c]--> n_2

Suppose the device is in the partial state n_1, that is, the device is in
a state that satisfies the predicates corresponding to n_1. Suppose we
test the device and observe that it fails to reach n_2. What conclusions
can we draw? The CPD asserts that the device goes from partial state n_1
to n_2 because of the function F of component c, and therefore, we can
hypothesize that the failure to reach n_2 is due to component c not
delivering function F. Corresponding to this transition, we can identify a
possible malfunction state "Component c not delivering function F." The
diagnostic rule, "device satisfies n_1 but not n_2," can be used to
establish this malfunction mode.

If the annotation had instead been "By-CPD CPD-1," where CPD-1 is a
reference to another CPD, we could examine CPD-1 to see why this
transition failed (a transition in CPD-1 must have failed if the
transition from n_1 to n_2 failed). Ultimately, we can identify the
component function responsible for the failure of the device.

There is no malfunction corresponding to a transition with the annotation
By-Domain-Law; a domain law cannot fail to hold. Of course, the designer's
account of the role played by the domain law could be incorrect, but we
are assuming here that the FR itself is correct. How to verify the FR is
an interesting issue in its own right but is not the subject under
discussion.

The technique of identifying a component malfunction either directly from
the annotation By-Function or by recursive application of By-CPD leads to
a diagnostic tree with leaf nodes that are malfunctions of components or
subcomponents. The diagnostic rule for each malfunction consists of rules
of the form "If the predicates corresponding to node n_1 are true, but
those corresponding to n_2 are not true, then ...."
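
The derivation of malfunction hypotheses from annotations can be sketched
as a recursive walk. The sketch assumes one annotation per transition,
assumes that CPDs do not refer to each other cyclically, and uses an
invented encoding (tuples for annotations, a set of failed transitions);
the "sub" CPD with its "wire" component is hypothetical.

```python
def candidate_malfunctions(cpd_name, cpds, failed):
    """Collect component malfunctions that could explain the failed transitions.

    cpds:   name -> list of (source, target, annotation) transitions
    failed: set of (cpd name, source, target) transitions observed to fail
    """
    found = []
    for src, dst, (kind, *args) in cpds[cpd_name]:
        if (cpd_name, src, dst) not in failed:
            continue
        if kind == "By-Function":
            function, component = args
            found.append(f"{component} not delivering {function}")
        elif kind == "By-CPD":
            # Recurse into the referenced CPD to locate the failed transition.
            found.extend(candidate_malfunctions(args[0], cpds, failed))
        # By-Domain-Law: a domain law cannot fail, so nothing is added.
    return found


cpds = {
    "main": [
        ("n1", "n2", ("By-Function", "voltage generation", "battery")),
        ("n2", "n3", ("By-CPD", "sub")),
        ("n3", "n4", ("By-Domain-Law", "Ohm's law")),
    ],
    "sub": [("m1", "m2", ("By-Function", "conduct", "wire"))],
}
failed = {("main", "n2", "n3"), ("sub", "m1", "m2"), ("main", "n3", "n4")}
hypotheses = candidate_malfunctions("main", cpds, failed)
```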

What happens when we have more than one annotation, as in Figure 5, where
the transition is explained by appealing to more than one function? In
this example, the transition can fail when "pipe2" is blocked, or has a
failure in its thermal (heat-exchange) function, or when the conditions in
the qualifiers are not satisfied. In this case, the transition's failure
can only identify them as possible malfunctions. It cannot establish them
because additional information is necessary.

Note, however, that not all diagnostic knowledge can be derived from
design information alone. For example, rank-ordering diagnostic hypotheses
in terms of likelihood and pursuing them in the order of most-to-least
probable is quite common in reasoning. But this ordering requires
knowledge of failure probabilities for components. This information is not
derivable from a causal model of how a device works. Additional
information in the form of failure rates is needed. Conversely, not all
diagnostic knowledge derived from causal models is directly usable, since
some variables mentioned in the diagnostic rules may not be directly
observable. Additional inference may be required.

Design verification. By comparing the behaviors predicted by simulation
and the FR of the desired device function, one can verify whether the
design will achieve its functions in the intended fashion. There are two
potential difficulties, however, in using FR and simulation results for
design verification.

First, there is a possible gap in levels of abstraction between the
language used in the function specification in FR and the language in
which the simulated behavior is represented. For example, the behavior of
transistors and resistors may be simulated in terms of currents and
voltages, but the function of the circuit as a whole might be described as
an oscillator or as an adder. To make verification possible, one must
ensure that each abstract concept used in the functional specification is
clearly defined in terms of the concepts employed in simulating the
behavior.

Second, the model used for simulation may contain details irrelevant to
verifying the function of interest, or it may not contain all the relevant
information. For example, a pipe as a component can be modeled as a
conduit for fluid flow, as an object with thermal properties, or as both.
If the function to be verified concerns only flow, we would use this
information to construct an appropriate simulation model that is as simple
as possible while containing all relevant aspects.

The CPD can be used as follows in the design-verification task. The
predicates that appear in the CPD definitions of the nodes and the
functional predicate P_F are terms of interest at the device level. These
predicates are first defined in terms of objects and predicates that
appear in component definitions. For example, suppose the predicate
"Amplification" appears in the definition of a CPD node and that the
component behaviors are in terms of voltages and currents. We first define
the predicate in terms of voltages at the input and output of relevant
components. Second, we perform the simulation. Finally, we attempt to
establish that states in the simulated behavior correspond to each CPD
node and that each CPD transition in fact occurs in the simulated behavior
in the intended way.

Consider a transition from n_1 to n_2 in the CPD of an electrical circuit.
Let's say it is annotated as By-Function F-Of-Component c. Suppose n_1 is
characterized by the predicate "amplification at port p_1 > 15" and n_2 by
"amplification at port p_2 > 30," where "amplification at port x" is
defined in the language used in the simulation model as

Voltage at x / Voltage at p_0.

To verify this portion of the CPD from n_1 to n_2, we search for a state
in the simulated behavior where the predicted values of the voltages at
p_1 and p_0 satisfy the condition for n_1, using this definition of
amplification. If such a state is found, we must also find the same (or
later) state in which the condition for n_2 is satisfied. If such a state
is found as well, we at least know that the situations described by n_1
and n_2 actually take place in the simulated behavior.
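
The search just described can be sketched over a simulated behavior given
as a list of states. This is a minimal illustration, not the verification
scheme of the cited work: the port names (including the reference port
called p0 here), the voltage values, and the function names are all
assumptions, and the check establishes only that both situations occur in
order, not that one caused the other.

```python
def amplification(state, port, ref="p0"):
    # Amplification at a port: voltage at the port over the reference voltage.
    return state[port] / state[ref]


def verify_transition(trace, cond_src, cond_dst):
    """Find a state satisfying cond_src, then the same or a later one satisfying cond_dst."""
    for i, state in enumerate(trace):
        if cond_src(state):
            return any(cond_dst(later) for later in trace[i:])
    return False


def node1(state):
    return amplification(state, "p1") > 15


def node2(state):
    return amplification(state, "p2") > 30


# Hypothetical simulated voltages at three successive time points.
trace = [
    {"p0": 1.0, "p1": 10.0, "p2": 12.0},
    {"p0": 1.0, "p1": 20.0, "p2": 25.0},
    {"p0": 1.0, "p1": 20.0, "p2": 40.0},
]
occurs = verify_transition(trace, node1, node2)
```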

However, before we can claim verification of the CPD transition, we must
show that the realization of condition n_2 was caused by condition n_1 and
function F of component c. If component c had no role in the transition,
it may not have been needed.

The meaning of one event causing another is a contentious philosophical
issue. However, within the context of any one modeling and simulation
scheme, one can define causal relations unambiguously. Iwasaki and
Chandrasekaran provide such a definition of causal relations in the
context of a particular model-formulation and simulation system called DME
(Device Modeling Environment), based on the bottom-up, behavior-oriented
approach (see sidebar again). In the verification scheme described in that
reference, this definition is used to show that the conditions specified
by a CPD node and by the annotations on the transition link out of the
node play a causal role in achieving the condition specified by its
successor node.

Redesign. In this task, the goal is to modify the artifact so that it
meets somewhat different functions. As previously noted, if the required
changes in function are drastic, radical structural alterations may be
needed, possibly requiring another design from scratch. However, if
changes are small, redesign can be accomplished by relatively simple
modifications to the existing structure, perhaps by parametric changes in
the components and substances. We examine the role of DR in general and FR
in particular in the redesign problem, assuming that the required changes
are parametric.

To accomplish the task, we need to solve three subtasks: identifying
components that require modification, identifying needed modifications,
and verifying that these modifications produce the desired function
changes.

Deciding on the appropriate modifications
requires additional knowledge, some of which
might be found in other parts of DR. For example,
if DR includes the reasons certain design options
were considered but not chosen, this information
might be relevant to the redesign problem.
Regarding verification, Pegah, Bond, and
Sticklen's use of FR for parametric simulation is
relevant. They show how FR can be viewed as a form
of compiled simulation and suggest ways in which
FR can incorporate information about device
behavior over ranges of component parameters.
With this information, one can straightforwardly
derive device behavior when component parameters
are changed.

We outline how Kritik, a system that performs a
form of case-based design, uses FR to identify
candidates for modification. Suppose we want to
modify NAC to cool high-acidity sulfuric acid
rather than low-acidity nitric acid. First,
Kritik compares the functions desired of and
delivered by the candidate NAC design and notes
that they differ in

(1) the substance to be cooled (sulfuric
instead of nitric acid) and

(2) a property of the substance (high
instead of low acidity).

Since the substance property difference occurs
in the function “cool (low-acidity) nitric acid,”
Kritik uses this function to access the CPD
responsible for it. A fragment of this CPD, the
transition from “state2” to “state3,” is shown
in Figure 5. Kritik traces through this CPD,
checking each state transition to determine
whether the goal of reducing the substance
property difference (“low-acidity -> high-acidity”)
can be achieved by modifying an element in the
transition. For example, in the transition
“state2 -> state3,” Kritik finds that “pipe2” has
an “allow” function restricted to low-acidity
substances.

Kritik has a typology for modifying device
components:

•  the parameters of a component can be tweaked,

•  the modality of a component’s operation can
be changed, and

•  one component can be replaced by another.

This typology correspondingly generates the
modification hypotheses that “pipe2”

•  allows the flow of high-acidity substances in
a different parameter setting,

•  allows the flow of high-acidity substances in
a different mode of operation, and

•  has to be replaced with some new “pipe2” to
allow the flow of high-acidity substances.
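
The trace-and-hypothesize step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Kritik's actual implementation: the `Transition` record, the function names, the first (pump) transition, and the wording of the templates are all hypothetical, and only the "allow" function restricted to low-acidity substances comes from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Transition:
    """One CPD state transition (hypothetical record layout)."""
    source: str
    target: str
    component: str       # component whose function enables the transition
    function: str        # name of that function, e.g. "allow"
    restricted_to: str   # substance property the function is restricted to


# One template per entry in the modification typology:
# parameter tweak, change of operating mode, component replacement.
HYPOTHESIS_TEMPLATES = {
    "parameter": "{c} allows the flow of {p} substances in a different parameter setting",
    "mode": "{c} allows the flow of {p} substances in a different mode of operation",
    "replacement": "{c} has to be replaced with a new component that allows the flow of {p} substances",
}


def candidate_transitions(cpd: List[Transition], current_property: str) -> List[Transition]:
    """Trace the CPD and keep transitions tied to the property that must change."""
    return [t for t in cpd if t.restricted_to == current_property]


def modification_hypotheses(component: str, desired_property: str) -> List[str]:
    """Generate one hypothesis per modification type for a candidate component."""
    return [tmpl.format(c=component, p=desired_property)
            for tmpl in HYPOTHESIS_TEMPLATES.values()]


# Illustrative CPD fragment; the first transition is invented for contrast.
nac_fragment = [
    Transition("state1", "state2", "pump", "move", "any"),
    Transition("state2", "state3", "pipe2", "allow", "low-acidity"),
]

candidates = candidate_transitions(nac_fragment, "low-acidity")
hyps = [h for t in candidates
        for h in modification_hypotheses(t.component, "high-acidity")]
```

The sketch stops where the text does: it enumerates hypotheses but does not choose among them.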

Because how the modification is chosen does not
directly relate to DR, we omit that discussion.
Goel details how Kritik handles the replacement
of nitric acid with sulfuric acid.


Design rationale is a record of design activity:
options that were considered, choices that were
(and were not) made along with reasons for the
decisions, and how designers satisfied themselves
that the device would work as intended. We have
proposed the use of a framework called Functional
Representation as a candidate for capturing the
latter component of design rationale. FR encodes
the designer's account of the causal processes in
the device that culminate in achieving its
function. The representation makes explicit the
components' roles in the causal process. FR has
been implemented in several versions and used to
represent and reason about a number of systems,
but using it to represent a component of design
rationale is novel.

We have discussed the limitations of FR as design
rationale in that it captures only the causal
component. We have also discussed the limitations
of the current repertoire of representational
primitives in FR for capturing the causal
component: They are restricted to devices that
achieve their functions by means of causal state
changes; complex temporal relations are not easy
to capture. But the basic framework is extensible.
The central idea that makes FR a good candidate
for representing the causal component of design
rationale is that it embodies a theory of how
causal stories are understood. This idea will
continue to form the organizing principle for any
extensions of FR as well.

Design rationale is not only many-faceted but
also intrinsically open-ended. There is no real
sense in which a design rationale can be
completely represented. Ultimately, much of it
will appeal to shared commonsense knowledge about
how objects and people behave. What aspects of the
rationale to make explicit will depend on intended
users and tasks.

Knowledge Systems

The Role of Explicit Representation of Design Knowledge


B. Chandrasekaran, Ohio State University
William Swartout, University of Southern California


THE FOLLOWING TWO ARTICLES
examine explanation in expert systems.
The unifying idea in these projects is of
general importance to explanation: The
more explicitly we represent the knowledge
underlying a system's design, the
better its explanations.

Knowledge and methods of using knowledge are the
fundamental elements of knowledge systems. Much
knowledge-system research has been concerned with
developing increasingly explicit representations
of these elements to support increasingly
sophisticated techniques for knowledge
acquisition, system building, and explanation.
From an explanation standpoint, explicit
representations of knowledge and method enable a
knowledge system to examine its own structure and
produce explanations from the same structures
used for reasoning.

The idea that explicit representations facilitate
explanation was recognized early on (for example,
in Mycin). It soon became evident that higher
level strategies played a role in a knowledge
system's ability to solve problems. Researchers
began to explicitly represent problem-solving
strategies (methods or plans) for using knowledge
to solve problems. Examples here include
Digitalis Advisor, MDX, and Neomycin.


EXPLANATIONS OF A KNOWLEDGE SYSTEM'S CONCLUSIONS
CAN BE AS IMPORTANT AS THE CONCLUSIONS THEMSELVES.
THE UNIFYING IDEA IN THE NEXT TWO ARTICLES IS OF
GENERAL IMPORTANCE TO EXPLANATION: THE MORE
EXPLICITLY WE REPRESENT THE KNOWLEDGE UNDERLYING
A SYSTEM'S DESIGN, THE BETTER ITS EXPLANATIONS.


Another important idea for knowledge systems is
that we can obtain problem-solving knowledge from
other, more general knowledge. In justifying its
conclusions, a knowledge system might need to
justify the knowledge used to reach them. This in
turn requires access to the more general knowledge
that produced the system's knowledge. (Although
such general knowledge is sometimes called "deep"
knowledge, or "first principles," it is not better
knowledge; it is simply more general, that is,
useful for more purposes rather than for a
specific problem type.) Knowledge systems based on
explicit representations of knowledge and method,
with information about how and from what their
knowledge was obtained, are the foundation for
producing good explanations.

Each type of explicit knowledge makes specific
kinds of explanation possible. Also, each such
representation makes explicit an aspect of the
design of the knowledge system itself. For
example, representing generic problem-solving
methods or strategies is a way to make explicit
the strategies that are often implicit in
expert-system knowledge bases. Similarly,
representing the more general knowledge used to
derive specific pieces of knowledge in the
knowledge base involves making another aspect of
the design explicit, namely, where the knowledge
in the knowledge base came from. Thus, the
operational principle at work here calls for
increasingly explicit design knowledge.

Tasks and knowledge

Let's say system S performs a task T using
knowledge K. (We would normally use $S_T$ and
$K_T$ to indicate that S is a system that solves
T, and that K is the knowledge related to T, but
we dispense with the additional notational
complexity here.) For example, Mycin is a
problem-solving system (S) that performs the task
of consulting about infectious diseases (T) using
the rules in its knowledge base (K). A task is a
collection of problem instances of a certain type.
For example, Mycin can solve a number of instances
of consultation problems in infectious diseases.

Δ(S) is the knowledge needed to design S, and
R(S) is the design record (the record of how Δ(S)
was used to design S). Δ(S) typically includes
substantial knowledge about the domain's subject
matter, the nature of the task, the range of
appropriate strategies, the architecture of S, how
the parts of S accomplish T, and the origin and
range of applicability of K. R(S) would consist of
the various design documents recording the design
process. In the Mycin example, Δ(Mycin) is the
designer's knowledge about the domain, the task,
and AI architectures (most of which is never made
explicit), while R(Mycin) explicitly captures a
small part of its design using an informal
notation (such as English or diagrams).
Abstractly, knowledge-system design is a process
that produces S from Δ(S), and R(S) is a partial
record of this process.

By the nature of design knowledge, Δ(S) can
support the design of a range of systems of which
S is a specific instance. For example, Δ(S) might
contain a parametric family of strategies, one of
which is instantiated in S. In the Mycin example,
the same underlying knowledge of the domain, task,
and strategies could be used to design variants of
Mycin that perform different versions of the task
or the same task in different ways.

Much expert-system research has emphasized the
advantages of explicitly representing K. In fact,
expert systems as a field is identified by its
explicit representation of K and its application
of various forms of inference to K to solve
problems. While we need not explicitly represent
Δ(S) to solve T, there are several advantages in
explicitly representing relevant components of
Δ(S), including how knowledge in them was used:


•  Knowledge in Δ(S) and R(S) about strategies
and about K lets us explain the behavior of S and
justify its conclusions.

•  By operating on the representation of
strategies in Δ(S), we can generate a range of
behaviors of S.

•  The explicit representation in Δ(S) of
knowledge and strategies used by S makes system
maintenance easier.

Of course Δ(S) is open-ended, so we cannot
explicitly and formally represent all of it.
However, as research in knowledge types,
strategies, and architectures progresses, we will
have an increasingly rich vocabulary to represent
more and more parts of Δ(S).

The accompanying articles

The following articles present results from two
groups, both concerned with using knowledge about
a system's design to explain it. In the first
article, Michael C. Tanner and Anne M. Keuneke
report on work at Ohio State University, which has
concentrated on

•  identifying appropriate strategy abstractions
(the generic-task work);

•  exploring the relation between task
requirements and strategies available in S; and

•  understanding the relationship between
diagnostic-task knowledge and structure-function
models of the device being diagnosed.

The work reported in the article focuses on
producing strategic and task explanations and on
justifying knowledge.


The second article comes from the Explainable
Expert Systems project, in which William Swartout
and his associates have focused on

•  distinguishing and providing explicit
representations for several kinds of knowledge in
Δ(S), including a domain model and general
strategies;

•  using an automatic program writer to create an
explicit design record R(S) that records how
knowledge in Δ(S) was applied to create S; and

•  capturing the "design" of explanations, that
is, what the system was trying to say and how it
was trying to say it.

In the terms used above, the EES project is
concerned with two main tasks: the task T solved
by the knowledge system S, and the task of
constructing explanations E(T) of S's performance.
Let Δ(E(T)) denote the design knowledge that
supports the construction of the explanation E(T),
and let R(E(T)) be the record of how that design
knowledge was used to construct a particular
explanation. R(E(T)) captures what the explanation
component tried to convey in a particular
explanation, what explanation strategies it used
to convey it, and what alternative strategies
exist to get the same points across. This is
exactly the information needed by the dialogue
component to understand follow-on questions in
context and to correct misunderstandings. In their
article, Swartout, Cecile Paris, and Johanna Moore
discuss knowledge justifications and issues
related to presenting explanations and dialogue
with users.

Abstract. This paper focuses on the task of design verification using both knowledge
of the structure of a device and its intended functions. In particular, it addresses the
question of when one can say a behavior predicted by a prediction system achieves the
desired function in the manner intended by the designer. We use Functional
Representation (Sembugamoorthy & Chandrasekaran 1986) to represent the function of a
device and the expected causal mechanism for achieving it. We present a formal
definition of matching between a system trajectory generated by a simulation system and
the description of a causal process to achieve a function expressed in Functional
Representation. We demonstrate behavior verification based on the definition, using two
predicted behaviors of the electrical power system of a satellite. We believe that
evaluating a behavior with respect to the expected causal process as well as the function
improves the chances of uncovering hidden flaws in a design that may otherwise go
undetected at an early stage.


1.  Introduction 

Simulation of the behavior of the design of a structure is an important means
for design evaluation, which must ascertain that the design achieves the
intended function. To achieve a robust modeling and simulation capability,
a system must be able to compose a simulation model from pieces each of
which may be applicable to a variety of situations. At the same time, in order
to provide useful feedback about the design of a device based on the result
of simulation, the system must be able to evaluate the predicted behavior with
respect to the knowledge of the function.

In this paper, we focus on the task of design verification using both
knowledge of the structure of a device and its intended functions. In
particular, we address the question of when one can say a behavior predicted
by a prediction system achieves the desired function in the manner intended
by the designer. We use Functional Representation (Sembugamoorthy &
Chandrasekaran 1986) to represent the function of a device and the expected
causal process for achieving the function. We will present a formal definition
of matching between an expected behavior and a predicted behavior.
Finally, we will demonstrate behavior verification based on the definition,
using an actual example of behavior predicted by a simulation system.

1.1  BEHAVIOR-ORIENTED AND FUNCTION-ORIENTED APPROACHES TO
MODELING

Research in model-based reasoning about physical systems has emphasized
representation of structures and reasoning about behavior from the
knowledge of their structures and physical principles. Several model-based
reasoning systems have been built (Falkenhainer & Forbus 1991, Crawford et
al. 1990, Iwasaki & Low 1991) that formulate a model of a device based on
its structure and predict its behavior. An important requirement in the
approach taken in these systems, which we shall call the behavior-oriented
approach, is that the knowledge is stored in small pieces, each representing a
conceptually independent physical phenomenon such as a physical process
or an aspect of the behavior of a component. For the pieces to be
composable, each of them must be defined in a context-independent manner
as much as possible, in the sense that there is no unstated assumption about
the surroundings of a component or the function of the whole device.
These systems predict a behavior in terms of a sequence or a graph of states,
each of which is characterized by the set of applicable knowledge pieces,
implied constraints, and variable values.

This type of model-based reasoning capability is useful for a system
aimed to help in design, since it allows the system to formulate a behavior
model automatically and to simulate its behavior so that the designer can
discover behavioral implications of design decisions easily. However, an
account of behavior in the form of a sequence of states must be evaluated to
be useful for further development of the design. Does the predicted
behavior achieve the desired function? Does it do so in the way the designer
intended? These are crucial questions in providing useful feedback to the
designer. In order for a model-based reasoning system to answer such
questions, it must have knowledge of the function of the device -- WHAT it is
supposed to do -- and the expected behavior -- HOW it is supposed to
achieve the function.

Functional Representation (FR) is a representational scheme for the
functions and expected behavior of a device. FR represents knowledge about
devices in terms of the functions that the entire device is supposed to achieve
and also of the sequence of causal interactions among components that lead
to achievement of the functions. FR takes a top-down approach to
representing a device in contrast to the bottom-up approach of
behavior-oriented knowledge representation and reasoning schemes. In Functional
Representation, the function of the overall device is described first and the
behavior of each component is described in terms of how it contributes to the
function, while in a behavior-oriented approach, the behavior of the entire
device is inferred from those of individual components.

In order to evaluate a design, one must be able to predict the possible
behavior of the design, as well as to determine whether the predicted behavior
achieves the expected functionality. Verification that a behavior of a
designed artifact achieves the desired goal must ascertain the following:

(1) the overall function of the device is achieved,

(2) the expected chain of events happens in the predicted behavior, and

(3) the causal connections expected between events exist in the
predicted behavior.

The purpose of this paper is to investigate this concept of behavior
verification and provide a formal definition of behavior verification of a
design with respect to its intended functions and the expected causal
processes for achieving the functions. As an example of a model-based
reasoning system, we use DME (Device Modeling Environment) developed at
Stanford University (Iwasaki & Low 1991). Given a design of a device,
DME formulates a computational model and predicts its behavior.

This paper is organized as follows: In Section 1.2, we will briefly describe
DME. In Section 2, we formally define functions, expected behavior, and
what it means for a predicted behavior to match an expected behavior.
Section 3 presents an example of behavior verification. We conclude by
discussing future work and related work in Section 4.

1.2  DEVICE MODELING ENVIRONMENT

DME is a program developed by the How Things Work project (Fikes et al.
1991). The goal of the project is to provide a computational environment
for design of electromechanical devices, and DME is the device modeling
program which forms the core of the environment. Given the topological
description of a device and initial conditions, DME formulates a
mathematical model and simulates its behavior.

In DME, knowledge about physical phenomena is organized into model
fragments in the knowledge base. Each model fragment represents
knowledge of a conceptually distinct physical phenomenon such as a
physical process, component behavior characteristics, etc. DME takes an
input description of the initial state, including the topological model of the
device, and searches the knowledge base for model fragments that are
applicable to the given situation. Equations to describe the behavior of the
device are formulated from the constraints associated with the set of model
fragments thus found. The equations are used to predict the behavior of the
device qualitatively using QSIM (Kuipers 1986) or numerically. During
prediction, if there are any changes in the set of applicable model fragments,
the set of equations is updated accordingly and prediction continues with the
new equation model.
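
The formulation step described above can be illustrated with a small sketch. It is not DME's implementation or API: the `ModelFragment` record, the two fragments in `LIBRARY`, their applicability tests, and the equation strings are all hypothetical, chosen only to show how the equation set follows from the set of applicable fragments and changes when that set changes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set

State = Dict[str, float]


@dataclass(frozen=True)
class ModelFragment:
    """A piece of knowledge about one physical phenomenon (hypothetical layout)."""
    name: str
    applies: Callable[[State], bool]   # applicability conditions
    constraints: FrozenSet[str]        # behavior constraints, kept as plain strings


def applicable(library: List[ModelFragment], state: State) -> List[ModelFragment]:
    """Search the library for fragments whose conditions hold in this state."""
    return [m for m in library if m.applies(state)]


def formulate(library: List[ModelFragment], state: State) -> Set[str]:
    """Collect the constraints of all applicable fragments into one equation set."""
    equations: Set[str] = set()
    for fragment in applicable(library, state):
        equations |= fragment.constraints
    return equations


# Made-up fragments; the equations are placeholders, not real DME constraints.
LIBRARY = [
    ModelFragment("solar-array-generating",
                  lambda s: s["sunlight"] > 0,
                  frozenset({"I_sa = k * sunlight"})),
    ModelFragment("battery-discharging",
                  lambda s: s["sunlight"] == 0 and s["charge"] > 0,
                  frozenset({"dC/dt = -I_load"})),
]

# When the applicable set changes between states, the equation model changes too.
day = formulate(LIBRARY, {"sunlight": 1.0, "charge": 0.5})
night = formulate(LIBRARY, {"sunlight": 0.0, "charge": 0.5})
```

A simulator built on this idea would call `formulate` again whenever a state change alters the result of `applicable`, then continue prediction with the new equations.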

Some model fragments represent instantaneous changes, which are
phenomena that take place too quickly to model as continuous phenomena.
Such model fragments do not have constraints but they have consequences,
which are facts to be asserted. When an instantaneous model fragment
becomes active, a new state is generated immediately to follow the current
state, and the consequences are asserted in the new state.

A model fragment m can be interpreted as one large implication of the
form $P_m \Rightarrow E_m$ or $P_m \Rightarrow C_m$, where $P_m$, $E_m$, and $C_m$ denote respectively
the conditions for the applicability of the model fragment, the behavior
constraints (equations in the case of a continuous phenomenon), and the
consequences of m (in the case of a discontinuous phenomenon).

Definition  1.  A  device  state  is  represented  as  a  set  of  state  variables  (V^) 
consisting of values of all the variables of interest in the description of the
device.  State  variables  can  be  either  continuous  or  discrete. 

Definition 2. A device trajectory, Tr, represents the course of behavior of the
device over time. It is a linear sequence of states.

2.  What does it mean to verify that a design achieves an expected function?

In this section, we define what it means for such a simulated behavior to
achieve the expected behavior represented in FR. This requires introduction
of the notions of a causal process description (CPD) and a function in FR. A
CPD is a causal explanation of how certain states of interest come about by
exhibiting a sequence of causal transitions. The transitions are annotated
with different types of causal explanation.

Definition 3. A Causal Process Description (CPD) is defined as a pair {C, G},
where C is the applicability condition and G is a directed graph G = {N, L}.
C specifies the condition under which the device is expected to behave as
specified by G. C is a necessary condition for applicability of the CPD but not a
sufficient condition. N is a set of nodes and L is a set of directed links
among nodes. Each node represents a partial description of a state. There
are two distinguished nodes in N, the initial node, $N_{init}$, and the final node, $N_{fin}$.
Each link represents a causal connection between nodes. The graph
may be cyclic, but there must be a directed path from $N_{init}$ to $N_{fin}$.

A link may have an attached qualifier, By-function-F-of(c), where c is a
component, to indicate the conditions under which the transition will take
place. A link can also have annotations of the types Provided(p), If(p), and
Trigger(p), where p is a wff, to indicate the type of the causal explanation to
account for the transition.

In order for us to be able to relate a Tr and a CPD, we require that each
node in a CPD be given a definition in the form of a wff about objects
and predicates defined in terms of model fragment attributes. We let def(n)
denote such a definition of a node n. For example, the node "Battery
charging" is defined as dC/dt > 0, where C is the variable, charge-level of the
battery. With such a definition, a node in a CPD becomes a partial
description of a state in Tr using attributes defined in the model fragment
library.

Definition 3 mentions qualifiers and annotations that can be attached to
the links between the nodes in a CPD. The full list of proposed annotations
can be found in (Sembugamoorthy & Chandrasekaran 1986) and (Keuneke
1991). For a link from $n_i$ to $n_j$, an annotation By-function-F-of(c) means
the causal interactions going from $n_i$ to $n_j$ must involve c achieving its
function F. The purpose of such an annotation is to allow a causal transition to be
explained in further detail by CPD's of component c. In contrast, qualifiers
allow one to specify further conditions on the causal transition. Provided(p)
means that condition p must hold during the causal transition. If(p) means
that the condition p must hold at state $n_i$. Trigger(p) means that p must not
hold before but must hold at some point after $n_i$ (inclusive).
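
As a rough illustration of Definition 3, the sketch below stores a CPD as a condition plus a directed graph and checks the one structural requirement the definition states: a directed path from the initial node to the final node. The class and field names are hypothetical, conditions are kept as plain strings rather than wffs, and the example is only a two-link fragment modeled loosely on the EPS CPD of Figure 1, with the battery-charging node treated as final purely for the sake of the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Link:
    """A causal link with its optional qualifier and annotations (as strings)."""
    source: str
    target: str
    by_function_of: Optional[str] = None   # component c in By-function-F-of(c)
    provided: Optional[str] = None         # Provided(p)
    if_: Optional[str] = None              # If(p)
    trigger: Optional[str] = None          # Trigger(p)


@dataclass
class CPD:
    condition: str          # applicability condition C
    nodes: Dict[str, str]   # node name -> def(n), kept as text
    links: List[Link]       # directed links L
    init: str               # initial node
    final: str              # final node

    def has_path(self) -> bool:
        """Depth-first search from init to final; tolerates cycles."""
        seen, stack = set(), [self.init]
        while stack:
            node = stack.pop()
            if node == self.final:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(l.target for l in self.links if l.source == node)
        return False


# Illustrative fragment only; node names and the choice of final node are assumptions.
cpd1_fragment = CPD(
    condition="(Shining Sun)",
    nodes={
        "sun-shining": "sun shining",
        "electricity-production": "I2 < 0",
        "battery-charging": "dC/dt > 0",
    },
    links=[
        Link("sun-shining", "electricity-production",
             by_function_of="SA",
             provided="solar array is in a closed circuit"),
        Link("electricity-production", "battery-charging",
             provided="battery is in a closed circuit with the solar array"),
    ],
    init="sun-shining",
    final="battery-charging",
)

# Dropping the second link leaves the final node unreachable.
broken = CPD(cpd1_fragment.condition, cpd1_fragment.nodes,
             cpd1_fragment.links[:1], "sun-shining", "battery-charging")
```

Keeping the annotations as separate optional fields mirrors the distinction the text draws between By-function-F-of(c) and the Provided, If, and Trigger conditions.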

In summary, a CPD describes a causal process from some perspective at
the device level, and the conditions on the causal transitions and explanations
of them are given as part of the description. Figures 1 and 2 show examples
of CPD's. They are for the electrical power system (EPS) aboard a satellite,
which we will use in Section 3 as an example. $N_{init}$ and $N_{fin}$ in each CPD are
indicated by a box in dashed lines and a box in thick lines respectively.

Definition 4: A function F is defined as a quintuple {$Type_F$, $P_F$, $Dev_F$, $C_F$,
$G_F$}, where

$Type_F$: One of {ToMake, ToMaintain, ToPrevent, ToControl}.

$P_F$: The functional goal, i.e. the wff that the function is to make
true.

$Dev_F$: The device that this function is a function of. This has to be a
model fragment in DME's knowledge base.

$C_F$: The condition which specifies when the function must be
achieved.

$G_F$: The set of CPD's describing the causal mechanism to achieve
the function.

Figure 3 shows the function of EPS. We consider four types of functions:
ToMake, ToMaintain, ToPrevent, and ToControl (Keuneke 1991). Note that
a device can have multiple functions, in which case each function will be
represented separately. In case of a function of type ToControl, $P_F$ must be
of the following form:

$(=\ v_0\ f(v_1, \ldots, v_n))$,

where the $v_i$'s are variables and $f$ is some function of its arguments.
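
To make the quintuple of Definition 4 concrete, here is a minimal sketch of it as a record, instantiated with the EPS function from Figure 3. The class and field names are hypothetical, and the goal, condition, and CPD references are kept as plain strings rather than wffs and graph objects.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class FunctionType(Enum):
    TO_MAKE = "ToMake"
    TO_MAINTAIN = "ToMaintain"
    TO_PREVENT = "ToPrevent"
    TO_CONTROL = "ToControl"


@dataclass(frozen=True)
class Function:
    type: FunctionType      # Type_F
    goal: str               # P_F, the wff the function is to make true
    device: str             # Dev_F, the device this is a function of
    condition: str          # C_F, when the function must be achieved
    cpds: Tuple[str, ...]   # G_F, names of the CPD's that achieve it


# The EPS function as given in Figure 3 (condition T means always required).
eps_function = Function(
    type=FunctionType.TO_MAINTAIN,
    goal="(Powered Load)",
    device="EPS",
    condition="T",
    cpds=("CPD1", "CPD2"),
)
```

A device with several functions would simply carry several such records, one per function.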


Conditions: (Shining Sun)

  [1] Sun shining
       |
       |  Provided: solar array is in a closed circuit
       |            ((Closed-electrical-loop Sip) ∧ (In-loop SA))
       |  By-function-of SA
       v
  [2] Electricity production by solar array  (I2 < 0)
       |
       |  Provided: battery is in a closed circuit with the solar array
       |            ((Closed-electrical-loop Sip) ∧ (In-loop SA) ∧ (In-loop BA))
       v
  [3] Battery being charged  (dC/dt > 0)
       |
       |  Provided: load is in a closed circuit with the solar array
       |            ((Closed-electrical-loop Sip) ∧ (In-loop LD) ∧ (In-loop SA))
       v


Conditions: ¬(Shining Sun) ∨ (Active Battery-over-charged)

  Battery being discharged  (dC/dt < 0)
       |
       |  Provided: battery and load are in a closed circuit;
       |            charge-level of battery > 0
       |            ((Closed-electrical-loop Sip) ∧ (In-loop LD)
       |             ∧ (In-loop BA) ∧ C > 0)
       v
  Current supplied to load  (I5 > 0)

Figure 2: CPD 2 of EPS

Given a device description and initial conditions, we can generate a Tr.
Suppose we also have an intended function for the device and associated
CPD's. Intuitively, we would like to say that the device achieves the function
in Tr if (1) the functional goal is achieved, (2) there are states in Tr matching
all the nodes in the CPD in the specified temporal order, and (3) for each
causal link in the CPD, there is a causal path in Tr that connects the cause to
the effect. In order to make these conditions more precise, we must define
the concept of a causal path in a trajectory. Then, we will define what it
means for a trajectory to match a CPD.

For the rest of this paper, we use the following notation: We will attach [s] to wffs, model fragments and variables to denote the following axioms:

p[s]: A wff p holds in the state s.

m[s]: The phenomenon represented by m is active in s.

v[s]: A wff that asserts the value of v holds in s.

We will use notations such as <, >, and ≤ to express ordering among nodes in a CPD and states in a trajectory. We write "n1 < n2", where n1 and n2 are nodes in a CPD, to indicate that n1 is strictly causally upstream of n2. For states s1 and s2 in a trajectory, "s1 < s2" means that s1 strictly precedes s2 in time. Note that the ordering is partial for nodes because a node can have multiple incoming and outgoing nodes. Ordering is total for states because a trajectory is a linear sequence of states.

Function: ToMaintain
  Dev_F: EPS
  P_F:   (Powered Load)
  C_F:   T
  G_F:   CPD1, CPD2

Figure 3: The function of EPS

2.1 CAUSAL DEPENDENCY IN A TRAJECTORY

We now present the definition of a causal dependency relation between axioms p1 and p2 in a trajectory Tr. Intuitively, we say p2 is causally dependent on p1, written "p1 =>c p2", when it can be shown in Tr that p1 being true eventually leads to p2 being true in Tr. Before we define the causal dependency relation among wffs more precisely, we introduce the notion of causal ordering among state variables.

2.1.1 Causal Ordering Among Variables. Suppose we are given a system of variables, and suppose some set of equations relates the values of these variables. Equations by themselves are inherently acausal and symmetric, but even when people represent the behavior of a system by a set of equations, they often perceive directed causal relations among variables. Causal ordering theory (Iwasaki & Simon 86) is used to reveal causal dependencies among variables in a set of equations and produce a graph structure that encodes these relations. In order to apply the procedure, one must have a set of independent equations, each of which represents a conceptually distinct mechanism in the situation. One must also know the variables which are externally controlled. Such variables are called exogenous variables.

Given a set of N equations which satisfy these requirements, the first step of the causal ordering procedure is to isolate all the subsets of variables whose values can be determined independently of the remaining variables. Such a subset of variables can be found by identifying a set of n equations which contains exactly n variables but which itself does not include a proper subset containing the same number of equations as variables. Such a subset is called a minimal complete subset. The variables in any minimal complete subset are the "uncaused causes" of the system, and they are causally independent of other variables. Each exogenous variable constitutes a minimal complete subset.

Next, the equations in all minimal complete subsets are removed from the original set of equations and their variables are also removed from the remaining equations, producing a reduced set of N - m equations in N - m variables, where m is the total number of equations (and variables) in all the minimal complete subsets. Then, a new independent subset of variables is determined in the reduced set. This process repeats until the set can no longer be reduced. For each equation in the original set, the variable that was reduced last is said to be causally dependent upon all the other variables in the equation, and a directed graph can be generated to depict the causal dependency structure of the entire set, with nodes representing variables and links representing causal dependency relations among them. Also, for each variable v in a minimal complete subset, we define D(v) to be the set of all equations in the set. In other words, D(v) is the set of all the equations that directly determine the value of v.
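
The reduction procedure described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in DME: each equation is represented only by the set of variables it mentions, the names `minimal_complete_subsets` and `causal_ordering` and the three-equation example system are hypothetical, and the subset search is brute force, so it is suitable only for small systems.

```python
from itertools import combinations


def minimal_complete_subsets(equations):
    """Find every minimal complete subset of a system of equations.

    `equations` maps an equation name to the set of variables it mentions.
    A subset of n equations is complete when it mentions exactly n variables;
    it is minimal when it contains no smaller complete subset.
    """
    names = sorted(equations)
    found = []
    # Smaller subsets are examined first, so a candidate that contains an
    # already-found subset cannot be minimal and is skipped.
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            candidate = frozenset(combo)
            if any(prev <= candidate for prev in found):
                continue
            variables = set().union(*(equations[name] for name in combo))
            if len(variables) == size:
                found.append(candidate)
    return found


def causal_ordering(equations):
    """Return (edges, determined_by) for a system of equations.

    edges holds (cause, effect) pairs, read as cause ->c effect.
    determined_by[v] plays the role of D(v) in the text.
    """
    remaining = {name: set(vs) for name, vs in equations.items()}
    edges, determined_by = set(), {}
    while remaining:
        subsets = minimal_complete_subsets(remaining)
        if not subsets:
            break  # the system cannot be reduced any further
        solved_vars, solved_eqs = set(), set()
        for subset in subsets:
            new_vars = set().union(*(remaining[name] for name in subset))
            for v in new_vars:
                determined_by[v] = subset
                for name in subset:
                    # Variables of the original equation that were removed in
                    # an earlier round are the causes of v.
                    for cause in equations[name] - new_vars:
                        edges.add((cause, v))
            solved_vars |= new_vars
            solved_eqs |= subset
        # Drop the solved equations and substitute out the solved variables.
        remaining = {
            name: vs - solved_vars
            for name, vs in remaining.items()
            if name not in solved_eqs and vs - solved_vars
        }
    return edges, determined_by


# Hypothetical system: x is exogenous, y is determined from x, z from y.
example = {"e_x": {"x"}, "e_y": {"x", "y"}, "e_z": {"y", "z"}}
edges, determined_by = causal_ordering(example)
```

In this example the exogenous variable x forms the first minimal complete subset, after which y and then z are determined, giving the chain x ->c y ->c z. When a minimal complete subset contains several variables, the sketch assigns them the same set D(v) and adds no ordering among them, in line with the treatment of feedback loops in the text.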

We will write v1 ->c v2 when v2 is causally dependent upon v1 according to the definition of causal ordering. When a minimal complete subset consists of more than one variable, the causal ordering procedure does not impose ordering among them, since such a situation indicates the existence of a feedback loop among them. In such a case, we say the variables are interdependent and write v1 <->c v2.

Even though the above description is given in terms of variables and equations, which imply domains of continuous variables and functions, the concept applies to domains of continuous as well as discrete variables. The "equations" in the case of discrete variables can be any axiom that can be used to determine the value of one variable depending on other variables, as long as such an axiom represents some conceptual mechanism in the situation. For example, the control of a sprinkler system that turns on between 1 and 2 am every day can be represented by an axiom,

1 ≤ time ≤ 2 ⇒ (On Sprinkler).

2.1.2 Causal Dependency Relations. Given the causal ordering procedure, we can now proceed to defining causal dependencies among wffs.

Definition 5. The causal dependency relation, denoted a_i =>c a_j, is defined between two wffs, a_i and a_j, in the descriptions of states in a trajectory Tr. We write "a_i =>c a_j" and say "a_j depends on a_i" or "a_i causes a_j". The relation =>c is transitive.

The following conditions specify when a wff can be said to be causally dependent on another in Tr:

(a) If p[s0], p[s1], ..., p[s] for all states from s0 up to s (in other words, p was part of the initial conditions and never changed), we say that p[s] is exogenous, and write # =>c p[s].

(b) If there exists a state s_j < s such that ¬p[s_j] and p[s_j+1], where s_j+1 is the immediate successor of s_j in Tr, there exists m[s_j] such that p ∈ R_m, and p[s_i] for all s_i between s_j+1 and s inclusive (in other words, p becomes true at some point before s as a consequence of the phenomenon represented by m), we say m[s_j] =>c p[s].

(c) For each p ∈ P_m, we say p[s] =>c m[s]. In other words, for each phenomenon active in s, we say that the phenomenon being active is dependent on its precondition being satisfied.

(d) If v1 ->c v2 according to the definition in Section 2.1.1, we say v1[s] =>c v2[s].

(e) For each equation e in D(v) for a variable v and a phenomenon m such that e ∈ E_m, we say m[s] =>c v[s]. In other words, we say v depends on the phenomenon giving rise to the causal relation between v and whatever other variables v depends on.

(f) v2[s] =>c v1[s] and v1[s] =>c v2[s] if v1 <->c v2 in the causal ordering in s.

(g) v'[s_i] =>c v[s_j], where s_j is the state immediately following s_i, and v' is the time-derivative of v in s_i.
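
Because =>c is transitive, checking whether one wff causally depends on another amounts to a reachability test over the direct dependencies produced by rules (a) through (g). The following Python sketch assumes those direct dependencies have already been collected as pairs of (name, state index) tuples; the function name `depends` and the example edges are illustrative assumptions, not part of the formalism.

```python
from collections import deque


def depends(direct_edges, cause, effect):
    """Return True if `effect` is reachable from `cause` through direct
    =>c edges, i.e. cause =>c effect holds by transitivity."""
    graph = {}
    for a, b in direct_edges:
        graph.setdefault(a, set()).add(b)
    seen, queue = {cause}, deque([cause])
    while queue:
        node = queue.popleft()
        for successor in graph.get(node, ()):
            if successor == effect:
                return True
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return False


# Hypothetical direct dependencies; each item is ((name, state), (name, state)).
direct = {
    (("Shining Sun", 0), ("SG", 0)),   # rule (c): precondition =>c phenomenon
    (("SG", 0), ("I2", 0)),            # rule (e): phenomenon =>c variable
    (("I2", 0), ("dC/dt", 0)),         # rule (d): causal ordering I2 ->c dC/dt
    (("dC/dt", 0), ("C", 1)),          # rule (g): derivative =>c next value
}
sun_causes_charge = depends(direct, ("Shining Sun", 0), ("C", 1))
charge_causes_current = depends(direct, ("C", 1), ("I2", 0))
```

Here `sun_causes_charge` is true because a chain of direct dependencies leads from the sun shining in state 0 to the charge level in state 1, while `charge_causes_current` is false because no chain runs in the opposite direction.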

2.2  When  is  a  function  achieved? 

We listed in Definition 4 different types of functions of devices and components. In this subsection, we spell out the conditions under which a device is said to achieve each type of function in a trajectory.

Definition 6: Let s_z denote the final state in Tr, and Dev denote either Dev_F or one of its components. A function F is said to be achieved in a trajectory Tr in any of the following cases, depending on Type_F. In all the cases, C_F must hold in the initial state of Tr.

Case 1: When Type_F = ToMake, F is achieved by Tr if

  1.1 P_F, the functional goal, holds in the final state s_z. We denote this by P_F[s_z]. And,

  1.2 There is some device variable v, and some state s in Tr, such that v[s] =>c P_F[s_z] (i.e. this fact causally depends on the operation of the device).

Case 2: When Type_F = ToMaintain, F is achieved in Tr if in all states s_i in Tr, the following is true: For some s_j such that s_j ≤ s_i in Tr,

  2.1 P_F[s_i], and

  2.2 There is some device variable v such that v[s_j] =>c P_F[s_i].

Case 3: When Type_F = ToControl, F is achieved in Tr if

  3.1 v0[s_z] = f(v1[s_z], ..., v_n[s_z]) (i.e. the functional relation holds between the value of the controlled variable and the values of the controlling variables in the final state),

  3.2 v_i[s_z] =>c v0[s_z] for 1 ≤ i ≤ n (i.e. the value of the controlled variable in the final state causally depends on the controlling variables), and

  3.3 There is some device variable v and some state s in Tr, such that v[s] =>c v0[s_z].

Case 4: When Type_F = ToPrevent, F is achieved in Tr if ¬P_F[s] for any state s in Tr (i.e. F is achieved in Tr if the functional goal of F does not hold in any state). We make the closed world assumption that ¬p unless p is explicitly known to hold.
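
The cases of Definition 6 can be illustrated with a small Python sketch. It is a simplification under stated assumptions: a trajectory is a list of sets of facts that hold in each state (so the closed world assumption is simply set membership), the causal requirement is delegated to a hypothetical oracle `caused_by_device(i)` that reports whether some device variable causally contributes to the goal in state i, and the ToControl case is omitted. The function name and the example facts are illustrative only.

```python
def function_achieved(func_type, goal, trajectory, caused_by_device, condition=None):
    """Check the ToMake, ToMaintain and ToPrevent cases on a trajectory.

    trajectory: list of sets; each set holds the facts true in that state.
    caused_by_device(i): hypothetical oracle, True when some device variable
        v in some state s_j <= s_i satisfies v[s_j] =>c goal[s_i].
    condition: the condition C_F, or None when C_F is simply T.
    """
    # In all cases C_F must hold in the initial state.
    if condition is not None and condition not in trajectory[0]:
        return False
    last = len(trajectory) - 1
    if func_type == "ToMake":
        return goal in trajectory[last] and caused_by_device(last)
    if func_type == "ToMaintain":
        return all(goal in state and caused_by_device(i)
                   for i, state in enumerate(trajectory))
    if func_type == "ToPrevent":
        # Closed world: a fact absent from a state is taken to be false.
        return all(goal not in state for state in trajectory)
    raise ValueError(f"unsupported function type: {func_type}")


# Hypothetical trajectories; the oracle below always answers yes.
powered = [{"Sun", "Powered Load"}, {"Powered Load"}, {"Powered Load"}]
dropout = [{"Sun", "Powered Load"}, set(), {"Powered Load"}]
device_always_involved = lambda i: True
```

With these inputs, maintaining "Powered Load" succeeds on `powered` and fails on `dropout`, whereas a ToMake function with the same goal would be achieved on `dropout` because only the final state matters in that case.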

Definition 7: A trajectory Tr is said to match a CPD if there is a mapping st from nodes in the CPD to the states in Tr that satisfies the following conditions:

1. for each node n in the CPD there is a state st(n) in Tr where def(n) holds, and

2. for any nodes n1 and n2 in the CPD, st(n1) ≤ st(n2) iff n1 < n2, and

3. for each causal link l from n1 to n2, there is a causal path def(n1)[st(n1)] =>c def(n2)[st(n2)] in Tr, where def(n)[st(n)] denotes that def(n) holds in the state st(n). Furthermore, if l has an attached qualifier, Provided(p), p must hold for all states between st(n1) and st(n2) inclusive. If l has an attached annotation, By-function-of, which points to a component o, there must be a causal path o[s] =>c def(n2)[st(n2)] for some state s such that st(n1) ≤ s ≤ st(n2).

Clause 1 of the above definition ensures that for each node in the CPD, there is a state in Tr that matches it. Clause 2 makes sure that the temporal ordering of causes and effects in the CPD is preserved in the temporal ordering of their corresponding states in Tr. Finally, Clause 3 ensures that the causal paths exist in Tr that correspond to the causal links in the CPD.
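
A check of the three clauses for one candidate mapping st can be sketched as follows. This is an illustrative simplification, not the verification program itself: node definitions and Provided qualifiers are modelled as Python predicates over a state, the causal relation =>c is supplied as a hypothetical oracle `causes`, the temporal clause is checked only along each link, and the By-function-of annotation is left out. All names and the two-state example are assumptions.

```python
def matches_cpd(nodes, links, st, trajectory, causes):
    """Check one candidate mapping `st` against the clauses of Definition 7.

    nodes: {node name: predicate(state) -> bool}, standing in for def(n).
    links: list of (n1, n2, provided); provided is a predicate or None.
    st: {node name: index of the matching state in the trajectory}.
    causes(n1, i, n2, j): hypothetical oracle for def(n1)[s_i] =>c def(n2)[s_j].
    """
    # Clause 1: def(n) holds in the state that each node is mapped to.
    if not all(holds(trajectory[st[name]]) for name, holds in nodes.items()):
        return False
    for n1, n2, provided in links:
        i, j = st[n1], st[n2]
        # Clause 2 (along each link): the cause may not come after the effect.
        if i > j:
            return False
        # Clause 3: a causal path connects the cause to the effect.
        if not causes(n1, i, n2, j):
            return False
        # Provided(p): p holds in every state from st(n1) to st(n2) inclusive.
        if provided is not None and not all(
            provided(trajectory[k]) for k in range(i, j + 1)
        ):
            return False
    return True


# Hypothetical two-node CPD checked against a two-state trajectory.
trajectory = [
    {"I2": -1.0, "closed": True},
    {"I2": -1.0, "dCdt": 0.5, "closed": True},
]
nodes = {
    "n2": lambda s: s.get("I2", 0) < 0,
    "n3": lambda s: s.get("dCdt", 0) > 0,
}
links = [("n2", "n3", lambda s: s["closed"])]
always = lambda n1, i, n2, j: True  # stand-in for a real causal oracle
ok = matches_cpd(nodes, links, {"n2": 0, "n3": 1}, trajectory, always)
bad = matches_cpd(nodes, links, {"n2": 1, "n3": 0}, trajectory, always)
```

The first mapping is accepted, while the second is rejected because the definition of n3 does not hold in the first state. A full matcher would search over candidate mappings rather than test a single one.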

Armed with Definitions 1 through 7, we are now ready to state precisely what we mean by verification that a predicted behavior achieves the expected behavior.

Definition 8: We say that a trajectory Tr of a device achieves the expected behavior with respect to a function F when the following conditions are met. Tr_i denotes the subsequence of Tr from the initial state up to and including state s_i:

1. F is satisfied in Tr according to Definition 6, and

2. F is achieved in the expected manner, which is verified as follows:

   Case 1: if Type_F is not ToMaintain, Tr matches one of the CPD's of F according to Definition 7.

   Case 2: if Type_F is ToMaintain, for each state s_i in Tr, a match between Tr_i and one of the CPD's exists such that s_i = st(N_fin) for the final node N_fin of the CPD.

Clause 1 of this definition makes sure that the function is achieved in the trajectory. Clause 2 ensures that the function is achieved in the way the designer intended. We must distinguish the cases where the type of the function is ToMaintain and others, because if the function is to make or prevent some condition, we need only to show that the condition is brought about (or prevented) in the intended manner. However, if the function is to maintain some condition throughout the trajectory, we must show that the condition is in fact brought about in the intended manner for every state.

3.  Example:  EPS  behavior 

In this section, we demonstrate behavior verification with an example of the electrical power system (EPS) aboard a satellite orbiting the earth (LMSC 1985). A simplified schematic diagram of the EPS is shown in Figure 4. The components of the EPS are a solar array (SA in Figure 4), a rechargeable nickel-cadmium battery (BA), a load representing all the electrical loads on board (LD), a relay (K1), and a device called a charge current controller (CCC) for controlling the relay. The solar array generates electricity when the satellite is in the sun, supplying power to the load and recharging the battery. The battery is a constant voltage source when it is charged between 6 and 30 ampere-hours. When the charge level is below 6 or above 30 ampere-hours, the electromotive force produced increases or decreases as it is charged or discharged.


SA:  Solar array
LD:  Electrical load on board
BA:  Rechargeable battery
CCC: Charge current controller
K1:  Relay
t1, t2, ...: Electrical terminals
s1, s2: Signal terminals
Line styles: signal connection, sensor data connection, electrical connection

Figure 4: Electrical Power System

Since the battery can be damaged when it is charged beyond its capacity, the charge current controller opens the relay when the voltage reaches 33.8 volts to prevent the battery from being over-charged. The charge current controller (CCC) has a sensor connected to the positive terminal (t7) of the battery to sense the voltage. When it reaches 33.8 volts during a sun-lit period, it turns on the relay K1. When the relay is energized, it opens and breaks the electrical connection, preventing further charging of the battery and switching the current source for the load from the solar array to the battery. When the relay is open or when an eclipse period begins, the charge-level starts to decrease. When the charge-level decreases to 6.0, the voltage will start to decrease. At 31.0 volts the CCC turns K1 off to close it if it has been opened.
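
The two voltage thresholds give the controller a hysteresis band, which the following Python sketch makes explicit. It is only an illustration of the switching rule described above: the function name `next_signal`, its parameters, and the sample voltage sequence are assumptions, and the relay is taken to be open exactly when the signal is on.

```python
def next_signal(signal_on, voltage, high=33.8, low=31.0):
    """Return the controller signal after one voltage reading.

    The signal turns on (energizing and opening the relay) at or above
    `high` volts, turns off (closing the relay) at or below `low` volts,
    and otherwise keeps its previous value.
    """
    if not signal_on and voltage >= high:
        return True
    if signal_on and voltage <= low:
        return False
    return signal_on


# Hypothetical sequence of sensed battery voltages.
readings = [33.0, 33.8, 33.0, 31.5, 31.0, 32.0]
signal, history = False, []
for volts in readings:
    signal = next_signal(signal, volts)
    history.append(signal)
```

Between the thresholds the signal keeps its previous value, so the relay does not chatter when the voltage hovers near a single switching point.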

The main purpose of the EPS is to supply electricity to the load constantly. This function of the EPS was shown in Figure 3, and the CPD's for achieving the function were shown in Figures 1 and 2.

2.1  EPS  MODEL  FRAGMENTS 

The structure of EPS as shown in Figure 4 is given to DME as a collection of model fragments, each representing the components and their connections. These model fragments represent the static aspect of the situation, and they are always active. In addition, there are model fragments representing various behavioral aspects of the components. They are activated/deactivated during the simulation according to the state of the world. Some of them are shown below with their conditions (P_m), behavior constraints (E_m), and results (R_m). Voltage and current are measured at terminals. The sign convention for current is that the current at a terminal is positive into the component owning the terminal. For the rest of the example, we will use the abbreviations shown below in parentheses to refer to the model fragments.

Model fragments concerning behaviors of battery

Battery-normal-operating-range (BN)
  P_m: (Rechargeable-battery $b) ∩ 6.0 amp-hours < (Charge-level $b) < 30.0 amp-hours
  E_m: (EMF $b) = 33.0 volts

Battery-over-charged (BO)
  P_m: (Rechargeable-battery $b) ∩ (Charge-level $b) > 30.0 amp-hours
  E_m: (EMF $b) = M+(Charge-level $b)

Battery-under-charged (BU)
  P_m: (Rechargeable-battery $b) ∩ (Charge-level $b) < 6.0 amp-hours
  E_m: (EMF $b) = M+(Charge-level $b)

Behaviors of the solar array

Solar-array-generating (SG)
  P_m: (Sun $s) ∩ (Shining $s) ∩ (Solar-array $a) ∩ (In-closed-circuit $a)
  E_m: (Current-thru-terminal (Plus-terminal $a)) < 0

Solar-array-in-eclipse (SE)
  P_m: (Sun $s) ∩ ¬(Shining $s) ∩ (Solar-array $a)
  E_m: (Current-thru-terminal (Plus-terminal $a)) = 0

Solar-array-in-open-circuit (SO)
  P_m: (Sun $s) ∩ (Shining $s) ∩ (Solar-array $a) ∩ ¬(In-closed-circuit $a)
  E_m: (Current-thru-terminal (Plus-terminal $a)) = 0

Behaviors of the charge current controller

Turn-k1-on (ON)
  P_m: (Charge-current-controller $ccc) ∩ (Signal (Signal-terminal-1 $ccc)) = off ∩ (Voltage-at-terminal (Voltage-sensing-terminal $ccc)) ≥ 33.8 volts
  R_m: (Signal (Signal-terminal-1 $ccc)) = on

Turn-k1-off (OFF)
  P_m: (Charge-current-controller $ccc) ∩ (Signal (Signal-terminal-1 $ccc)) = on ∩ (Voltage-at-terminal (Voltage-sensing-terminal $ccc)) ≤ 31.0 volts
  R_m: (Signal (Signal-terminal-1 $ccc)) = off

Behaviors of the relay

Relay-closed (RC)
  P_m: (Relay $r) ∩ (Relay-closed-p $r)
  E_m: (voltage-at-terminal (electrical-terminal-one $r)) = (voltage-at-terminal (electrical-terminal-two $r))
       (current-thru-terminal (electrical-terminal-one $r)) = - (current-thru-terminal (electrical-terminal-two $r))

Relay-open (RO)
  P_m: (Relay $r) ∩ ¬(Relay-closed-p $r)
  E_m: (current-thru-terminal (electrical-terminal-one $r)) = - (current-thru-terminal (electrical-terminal-two $r))

Relay-closing (CL)
  P_m: (Relay $r) ∩ ¬(Relay-closed-p $r) ∩ (Signal (Signal-terminal $r)) = off
  R_m: (Relay-closed-p $r)

Relay-opening (OP)
  P_m: (Relay $r) ∩ (Relay-closed-p $r) ∩ (Signal (Signal-terminal $r)) = on
  R_m: ¬(Relay-closed-p $r)

In addition, there are three model fragments used to model the sun: Sun-rise (RISE), Sun-set (ST) and Reset-orbit-time (RST). As it takes approximately 100 minutes for the satellite to go around the earth once, Orbit-time is a 100-minute clock, which is reset to 0 when it reaches 100. We model the sun as rising and setting when Orbit-time = 0 and 60 respectively, instead of modeling the satellite as revolving around the earth.
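
The clock-based model of the sun can be written down directly. The following Python sketch is an illustration only; the function names and the choice to treat the sun as not shining exactly at Orbit-time 60 are assumptions made for the example.

```python
ORBIT_PERIOD = 100.0  # minutes per orbit; the clock resets on reaching this value
SUNSET = 60.0         # Orbit-time at which the sun sets; it rises at 0


def orbit_time(elapsed_minutes):
    """Orbit-time: a 100-minute clock that wraps around to 0."""
    return elapsed_minutes % ORBIT_PERIOD


def sun_shining(elapsed_minutes):
    """True while Orbit-time lies between sunrise (0) and sunset (60)."""
    return orbit_time(elapsed_minutes) < SUNSET
```

For example, `sun_shining(59.5)` is true, `sun_shining(60)` is false, and `sun_shining(100)` is true again because the clock has been reset.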

We will use the following notations for quantities.

I_i    Current through terminal t_i into the component owning t_i
V_i    Voltage measured at terminal t_i
C      The charge level of the battery
R_ld   The resistance of the load
R_ba   The internal resistance of the battery
EMF    The electromotive force of the battery
Time   Orbit-time

Each node in the behaviors of EPS can now be defined precisely in terms of these model fragments and their attributes. Using this vocabulary, the function of EPS shown in Figure 3 translates to the following condition:

EPS Function: I5[s] > 0 for all s ∈ Tr.

The precise definition of each node in CPD1 and CPD2 using these model fragments and their attributes is shown in parentheses in Figures 1 and 2.

3.1  SIMULATED  BEHAVIOR. 

We simulated the behavior of EPS on DME. In the initial state, Time is between 0 and 60, the sun is up, the relay is closed, and the charge level is between 6 and 30.0 amp-hours. From this initial state, a number of behaviors are possible. In qualitative simulation mode, because of the ambiguity of qualitative simulation, there are multiple possible trajectories. Table 1 presents one of the possible trajectories of EPS generated by DME. The variable values are shown with their magnitude and the sign of their derivative. In a state where the derivative is undefined, the sign is shown as X. The rightmost column of the table shows the set of active models in each state. The set of all active model fragments in each state is actually much larger, but since most of them represent components, terminals, junctions, etc. and are active throughout the simulation, we show only the ones that change their activation status at some point. A model fragment that becomes activated in a given state is shown in bold. An x over a model fragment indicates that it becomes deactivated in the given state.

3.1.1 Trajectory Tr1. The behavior we consider is summarized in Table 1. In the initial state s0, since the sun is up and the relay is closed, the charge-level is increasing. When it eventually reaches 30 amp-hours (s1), the battery enters the over-charged state and the voltage level starts to rise. When it reaches 33.8 (s3), CCC changes the signal to on (s4), and K1 opens (s5). At this point, the solar array stops generating current, and the battery starts to discharge. The charge level and the voltage start to decrease. Soon, the sun sets (s7). As the charge level continues to decrease, the battery returns to the normal operating range (s9). It eventually becomes under-charged (s11), and the voltage starts to decrease below 33.0 volts. When it reaches 31.0 volts (s12), the signal to K1 is turned off (s13), and K1 closes (s14). Soon, the sun rises again (s17), and charging resumes.

Following are the equations that are generated by DME during the simulation. On the right of each equation, we indicate the model fragment that gives rise to the equation. "Junction" indicates that the equation was generated as a behavior constraint of electrical junction model fragments (not shown in Figure 4 to simplify the figure). Note that e5' is applicable when e5 is not, and vice versa.


Equation                       Model fragment
e1:   I2 = -n                  SG
e2:   I2 + I3 = 0              junction
e3:   I3 + I4 = 0              RC
e4:   I4 + I5 + I7 = 0         junction
e5:   EMF = M+(C)              BO or BU
e5':  EMF = 33.0               BN
e6:   V7 = EMF + R_ba I7       BA
e7:   V5 = R_ld I5             LD
e8:   V7 = V5                  junction
e9:   V5 = V4                  junction
e10:  V4 = V3                  RC
e11:  V3 = V2                  junction
e12:  dC/dt = I7               BA

R_ld and R_ba are exogenous.

The causal ordering among variables in the states where the relay is closed is shown in Figure 5. The causal ordering when the relay is open is mostly similar except that the links from I2 to I3 and from I3 to I4 are missing¹. In the figure, the variable at the head of an arrow is causally dependent on the variable at the tail. The causal links are labeled with equations that are responsible for the link. The link labeled i is an integration link to a variable from its derivative. Variables I5, V5, I7, and V7 are inter-dependent. The equations responsible for their inter-dependence are e4, e6, e7 and e8.

We prove that this trajectory satisfies the expected behavior of EPS. Due to the limitation of space, we omit the proof that the qualifiers, Provided and By-function-of, on causal links are satisfied. The proof of these conditions is

Figure 5: Causal ordering when K1 is closed


1 When the battery is in its normal operating range, the causal ordering is slightly different since e5' instead of e5 is applicable, eliminating the link from C to EMF.

Table 1: Trajectory, Tr1, of EPS

(1) That the function of EPS is satisfied by behavior Tr1 is clear from Table 1 since I5 > 0 in all states. That EPS or its components takes part in bringing about the fulfillment of the goal is subsumed by the proof in parts (2) and (3).

(2) The following proof applies for s being one of states s0 to s4, and s17 through s22, where the condition of CPD1, (Shining Sun), holds.

Let st(n1) = st(n2) = st(n3) = st(n4) = s. It follows that st(n1) ≤ st(n2) ≤ st(n3) ≤ st(n4). Since (Shining Sun)[s] ∩ (< I2 0)[s] ∩ (> dC/dt 0)[s] ∩ (> I5 0)[s], we have def(n)[s] for all n in CPD1.

Proof of l1: (Shining Sun)[s] =>c I2[s]

  (Shining Sun)[s] =>c SG[s] because of Definition 5.c.

  SG[s] =>c I2[s] because of Definition 5.e.

  It follows that (Shining Sun)[s] =>c I2[s].

Proof of l2: I2[s] =>c dC/dt[s]

  I2 ->c dC/dt in s as shown in Figure 5.

  It follows that I2[s] =>c dC/dt[s] because of Definition 5.d.

Proof of l3: I2[s] =>c I5[s]

  I2 ->c I5 in s as shown in Figure 5.

  It follows that I2[s] =>c I5[s] because of Definition 5.d.

Therefore, CPD1 of EPS is achieved in states s0 through s4, and s17 through s23 of Tr1.

(3) The following is true for s being one of states s5 through s16, where the condition of CPD2, ¬(Shining Sun) ∪ (Active BO), holds.

Let st(n5) = st(n6) = s. It follows that st(n5) ≤ st(n6).

Since (< dC/dt 0)[s] ∩ (> I5 0)[s], we have def(n)[s] for all nodes n in CPD2.

Proof of l4: I5[s] =>c dC/dt[s] ∩ dC/dt[s] =>c I5[s]

  I5 ->c dC/dt in s as shown in Figure 5. It follows that I5[s] =>c dC/dt[s] because of Definition 5.d.

Therefore, CPD2 of EPS is achieved in states s5 through s16 of Tr1.


3.  Discussion 


In this paper, we formalized a number of notions that were relatively informally specified in the Functional Representation language and defined matching between an expected behavior represented in Functional Representation and a predicted behavior. We also demonstrated its use in deciding whether a particular trajectory of a device achieves an expected behavior.

Our primary goal is to use the knowledge of functions and expected behavior for the purpose of design verification. It is important that the definition of behavior verification we have presented is not biased towards any particular perspective about which causal factors are more important than others. In other words, it does not require that the function or the expected behavior be described from a particular point of view. This definition of verification of a predicted behavior with respect to a function and an expected behavior is inclusive enough to allow a trajectory to match many representations of functions or expected behaviors. Likewise, there can be any number of trajectories that can be shown to match a given expected behavior, as there can be any number of designs that accomplish the same functionality. Thus, the mapping between trajectories and an expected behavior is many to many. However, if the goal is to verify that a predicted behavior achieves a given expected behavior, this non-uniqueness of a match is not a problem. Our definition does not establish that the given expected behavior is the only correct causal story for a given trajectory, nor that the trajectory is the only correct way to achieve the function. However, the definition does establish that a given design achieves the function in an expected manner, which is what is needed for our purpose of design verification.

Our next step is to implement a program that takes a functional representation and a trajectory and automatically proves whether or not the expected behavior is realized in the trajectory.

3.1 RELATED WORK

Bradshaw and Young (1991) and Franke (1991) have also proposed representations of the knowledge of a purpose and their use in design. They represent the intended function in a manner that is similar to the way functions are represented in Functional Representation. Bradshaw and Young built a system called Doris, which uses knowledge of purpose for evaluating behaviors generated by qualitative simulation as well as for diagnosis and explanation.

The focus of Franke's work on representing functions is slightly different from ours or Bradshaw and Young's in that he represents the purpose of a design modification and not that of a whole device. He developed a representation scheme, called TED, in which he expresses the purpose for making a modification δ in a structure using the same function types as those in Functional Representation. Thus, in order to prove that a function is achieved by a modification δ, he must compare the behavior of structure M and that of the modified structure, which is M with the modification δ. Another important characteristic of TED's representation of functions is that it can be a sequence (not necessarily linear) of partial descriptions. The representation of a function in TED typically says that a scenario guarantees a condition, where a scenario is a sequence of partial descriptions. The sequence of partial descriptions is matched against states in a sequence of qualitative states generated by QSIM.

The most important difference between our work and the work of Franke or of Bradshaw and Young is that we take not only the functions but also the causal interactions into account in evaluating behavior. We feel that it is important to test whether it is in fact the causal processes intended by the designer that are responsible for bringing about the achievement of the functional goal, since the satisfaction of the functional goal does not necessarily indicate that the design is functioning as intended. We believe that evaluating a trajectory with respect to the causal process as well as the function allows one to uncover hidden flaws in a design which may otherwise go undetected.
Abstract

Understanding the design of an engineered device requires
both knowledge of the general physical principles that
determine the behavior of the device and knowledge of what
the device is intended to do (i.e., its functional
specification). However, the majority of work in model-based
reasoning about device behavior has focused on
modeling a device in terms of general physical principles or
intended functionality, but not both. In order to use both
functional and behavioral knowledge in understanding a
device design, it is crucial that the functional knowledge is
represented in such a way that it has a clear interpretation in
terms of actual behavior. We propose a new formalism for
representing device functions with well-defined semantics in
terms of actual behavior. We call the language CFRL
(Causal Functional Representation Language). CFRL
allows the specification of conditions that a behavior must
satisfy, such as occurrence of a temporal sequence of
expected events and causal relations among the events and
the behavior of device components. We have used CFRL as
the basis for a functional verification program which
determines whether a behavior achieves an intended
function.

Introduction 

Understanding the design of an engineered device requires
both knowledge of the general physical principles that
determine the behavior of the device and knowledge of
what the device is intended to do (i.e., its functional
specification). However, the majority of work in model-based
reasoning about device behavior has focused on
modeling a device in terms of general physical principles
or intended functionality, but not both. For example, most
of the work in qualitative physics has been concerned with
predicting the behavior of a device from its physical
structure and knowledge of general physical principles. In
that work, great importance has been placed on preventing
a pre-conceived notion of an intended function of the
device from influencing the system's reasoning methods
and representation of physical principles in order to
guarantee a high level of "objective truth" in the predicted
behavior. In contrast, in their work based on the FR
(Functional Representation) language (Sembugamoorthy
& Chandrasekaran 1986) (Keuneke 1989), Chandrasekaran
and his colleagues have focused mostly on modeling a
device in terms of what the device is intended to do and
how those intentions are to be accomplished through
causal interactions among components of the device.

Both types of knowledge, functional and behavioral,
seem to be indispensable in fully understanding a device
design. On the one hand, knowledge of intended function
alone does not enable one to reason about what a device
might do when it is placed in an unexpected condition or to
infer the behavior of an unfamiliar device from its
structure. On the other hand, knowledge of device
structure and general physical principles may allow one to
predict how the device will behave under a given
condition, but without knowledge of the intended
functions, it is impossible to determine if the predicted
behavior is a desirable one, or what aspect of the behavior
is significant.

In order to use both functional and behavioral
knowledge in understanding a device design, it is crucial
that the functional knowledge is represented in such a way
that it has a clear interpretation in terms of actual behavior.
Suppose, for example, that the function of a charge current
controller is to prevent damage to a battery by cutting off
the charge current when the battery is fully charged. To be
able to determine whether this function is actually
accomplished by an observed behavior of the device, the
representation of the function must specify conditions that
can be evaluated against the behavior. Such conditions
might include occurrence of a temporal sequence of
expected events and causal relations among the events and
the components. Without a clear semantics given to a
representation of functions in terms of actual behavior, it
would be impossible to evaluate a design based on its
predicted behavior and intended functions.

While it is important for a functional specification to
have a clear interpretation in terms of actual behavior, it is
also desirable for the language for specifying functions to
be independent of any particular system used for
simulation. Though there are a number of alternative
methods for predicting behavior, such as numerical
simulation with discrete time steps or qualitative
simulation, a functional specification at some abstract level
should be intuitively understandable without specifying a
particular simulation mechanism. If a functional
specification language were dependent on a specific
simulation language or mechanism, a separate functional
specification language would be needed for each different
simulation language, which is clearly undesirable. What is
needed is a functional specification language that has
sufficient expressive power to support descriptions of the
desired functions of a variety of devices. At the same
time, the language should be clear enough so that for each
simulation mechanism used, it can be given an
unambiguous interpretation in terms of a simulated
behavior.

An  essential  element  in  the  description  of  a  function  is 
causality.  In  order  to  say  that  a  device  has  achieved  a 
function,  which  may  be  expressed  as  a  condition  on  the 
state  of  the  world,  one  must  show  not  only  that  the 
condition  is  satisfied  but  also  that  the  device  has 
participated  in  the  causal  process  that  has  brought  about 
the  condition.  For  example,  when  an  engineer  designs  a 
thermostat  to  keep  room  temperature  constant,  the  design 
embodies  her  idea  about  how  the  device  is  to  work.  In 
fact,  the  essential  part  of  her  knowledge  of  its  function  is 
the  expected  causal  chain  of  events  in  which  it  will  take 
part in achieving the goal. Thus, a representation
formalism  of  functions  must  provide  a  means  of 
expressing  knowledge  about  such  causal  processes. 

We have developed a new representational formalism
for representing device functions called CFRL (Causal
Functional Representation Language) that allows functions
to be expressed in terms of expected causal chains of
events. We have also provided the language with a well-defined
semantics in terms of the type of behavior
representation widely used in model-based, qualitative
simulation. Finally, we have used CFRL as the basis for a
functional verification program which determines whether
a behavior achieves an intended function.

This paper is organized as follows: We first describe the
representation of behavior over time in terms of which the
semantics of CFRL will be defined and our assumptions
about the modeling and simulation schemes that produce
such a behavior description. We then present the CFRL
language and define its semantics in terms of behavior.
We close with a discussion and summary.

Behavior  Representation 

Before describing CFRL, we briefly describe the behavior
representation in terms of which the semantics of CFRL
will be defined. A physical situation is modeled as a
collection of model fragments, each of which represents a
physical object or a conceptually distinct physical
phenomenon, such as a particular aspect of component
behavior or a physical process. A model fragment
representing a phenomenon specifies a set of conditions
under which the phenomenon occurs and a set of
consequences of the phenomenon. The conditions specify
a set of instances of object classes that must exist and a set
of relations that must hold among those objects and their
attributes for the phenomenon to occur. The consequences
specify the functional relations the phenomenon will cause
to hold among the objects and their attributes.

Model fragments can represent phenomena as occurring
continuously while the fragment's conditions hold or as
events that occur instantaneously when the conditions
become true. The consequences of a model fragment that
represents an event are facts to be asserted as a result of
the event, whereas the consequences of a model fragment
that represents a continuous process are sentences (e.g.,
ordinary differential equations) which are true while the
phenomenon is occurring.

When there exists at time t a set of objects represented
by model fragments m1 to mj that satisfy the conditions of
a model fragment m0, we say that an instance of m0 is
active at that time. We will call m1 through mj the
participants of the m0 instance.
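
The notion of an active model fragment instance and its participants can be illustrated with a minimal Python sketch. The class name, the dictionary-based object representation, and the "battery-charging" fragment are assumptions made for illustration only; they are not taken from DME, QPE, or QPC.

```python
from dataclasses import dataclass
from itertools import permutations
from typing import Callable, Tuple


@dataclass(frozen=True)
class ModelFragment:
    """A phenomenon with participant types, a condition, and consequences."""
    name: str
    participant_types: Tuple[str, ...]   # object classes that must exist
    condition: Callable[..., bool]       # relation among the participants
    consequences: Tuple[str, ...] = ()   # relations the phenomenon causes to hold


def active_instances(fragment, objects):
    """Return every participant tuple for which `fragment` is active.

    `objects` maps an object name to a dict with at least a "type" key.
    """
    found = []
    for combo in permutations(list(objects), len(fragment.participant_types)):
        types_ok = all(objects[n]["type"] == t
                       for n, t in zip(combo, fragment.participant_types))
        # The condition is only evaluated on correctly typed participants.
        if types_ok and fragment.condition(*(objects[n] for n in combo)):
            found.append(combo)
    return found


# Hypothetical fragment: a battery charges while an illuminated array is connected.
charging = ModelFragment(
    name="battery-charging",
    participant_types=("solar-array", "battery"),
    condition=lambda sa, ba: sa["illuminated"] and ba["connected"],
    consequences=("d(stored-charge)/dt > 0",),
)

objects = {
    "SA": {"type": "solar-array", "illuminated": True},
    "BA": {"type": "battery", "connected": True},
}
print(active_instances(charging, objects))  # participants of each active instance
```

Each returned tuple lists the participants of one active instance, in the order given by `participant_types`.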

Representation of physical knowledge in terms of model
fragments is a generalization of the representation of
physical processes and individuals in Qualitative Process
Theory (Forbus 1984). There are several systems,
including the Device Modeling Environment (DME)
(Iwasaki & Low 1991), the Qualitative Process Engine
(QPE) (Forbus 1989), and the Qualitative Process
Compiler (QPC) (Crawford, Farquhar & Kuipers), that use
similar representations for physical knowledge to predict
the behavior of physical devices over time. Though the
ways these systems actually perform prediction differ, the
basic idea behind all of them is the following: For a given
situation, the system identifies active model fragment
instances by evaluating their conditions. The active
instances give rise to equations representing the functional
relations that must hold among variables as a consequence
of the phenomena taking place. The equations are then
used to determine the next state into which the device must
move.

We assume that a behavior is a linear sequence of states.
The output of a qualitative simulation system such as QPE,
DME, and QPC is usually a tree or a graph of states. Each
path through the graph represents a possible behavior over
time. We will refer to such a path, i.e., a linear sequence
of states, as a trajectory.

A state represents a situation in which the physical
system being modeled is in at a particular time. "A
particular time" here can be a time point or interval. We
will not assume any specific model of time in this paper.
The only assumptions about time that we make are: (1) the
times associated with different states do not overlap; (2)
when a state sj immediately follows si in a behavior, there
is no other "time" that falls between the times (periods)
associated with si and sj; and (3) every state has a unique
successor (predecessor) unless it is the final (initial) state,
in which case it has none.

In our modeling scheme, each state has a set of variable
values and predicates that hold in the state. In addition,
each state has a set of active model fragment instances
representing the phenomena that are occurring in the state.

An  Electrical  Power  System 

This  section  presents  the  device  that  we  will  use 
throughout  the  rest  of  this  paper  as  an  example.  The 
device  is  the  electrical  power  system  (EPS)  aboard  an 
Earth  orbiting  satellite  (Lockheed  1984).  A  simplified 
schematic  diagram  of  the  EPS  is  shown  in  Figure  1.  The 
main  purpose  of  the  EPS  is  to  supply  a  constant  source  of 
electricity  to  the  satellite's  other  subsystems.  The  solar 
array generates electricity when the satellite is in the sun,
supplying power to the load and recharging the battery.
The battery is a constant voltage source when it is charged
between 6 and 30 ampere-hours. When the charge level is
below  6  ampere-hours,  the  voltage  output  decreases  as  the 
battery  discharges.  When  the  charge  level  is  above  30 
ampere-hours,  the  voltage  output  increases  as  it  is  charged. 


SA: Solar array
LD: Electrical load on board
BA: Rechargeable battery
CCC: Charge current controller
K1: Relay

Figure 1: An Electrical Power System.

Since the battery can be damaged when it is charged
beyond its capacity, the charge current controller opens the
relay when the voltage exceeds a threshold to prevent the
battery from being over-charged. The controller senses the
voltage via a sensor connected to the positive terminal of
the battery. When the voltage is greater than 33.8 volts,
the controller turns on the relay K1. When the relay is
energized, it opens and breaks the electrical connection to
prevent further charging of the battery, thereby switching
the current source for the load from the solar array to the
battery. When the relay is open or when an eclipse period
begins, the battery's charge level starts to decrease. When
the battery becomes under-charged, the voltage drops.
When it reaches 31.0 volts, the CCC turns relay K1 off to
close it.
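
The controller behavior described above is a two-threshold (hysteresis) switch. The following Python sketch is one way to express it, assuming the controller is sampled at discrete readings; the function names and the voltage readings are made up for illustration, and treating "reaches 31.0 volts" as "less than or equal to 31.0" is an assumption.

```python
OPEN_ABOVE_V = 33.8   # controller energizes K1 (relay opens) above this voltage
CLOSE_AT_V = 31.0     # controller de-energizes K1 (relay closes) at this voltage


def next_relay_state(relay_open: bool, battery_voltage: float) -> bool:
    """Return True if relay K1 should be open after sensing the voltage."""
    if not relay_open and battery_voltage > OPEN_ABOVE_V:
        return True           # stop charging to avoid over-charging
    if relay_open and battery_voltage <= CLOSE_AT_V:
        return False          # reconnect the solar array and resume charging
    return relay_open         # between the thresholds, keep the current state


def run(voltages, relay_open=False):
    """Apply the controller to a sequence of sensed voltages."""
    history = []
    for v in voltages:
        relay_open = next_relay_state(relay_open, v)
        history.append(relay_open)
    return history


# Made-up voltage readings, for illustration only.
print(run([32.0, 33.9, 33.0, 31.5, 31.0, 32.0]))
```

Between the two thresholds the relay keeps its previous state, which is why the relay state as well as the voltage appears in the conditions of the functional specification.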


CFRL 

We  now  describe  the  syntax  and  semantics  of  CFRL. 
Figure 2-a shows an example of the representation of a
function  of  the  EPS. 

Df: ?eps: Electrical-power-system
Cf: Object-set: ?sun: Sun  ?1: electrical-load
    Conditions: T
Gf:
(ALWAYS
  (AND
    (-> (AND (Shining-p ?sun)
             (Closed-p (Relay-component ?eps)))
        CPD1)
    (-> (OR (NOT (Shining-p ?sun))
            (Open-p (Relay-component ?eps)))
        CPD2)
    (-> (AND (> (Electromotive-force
                  (Battery-component ?eps))
                33.8)
             (Closed-p (Relay-component ?eps)))
        CPD3)
    (-> (AND (< (Electromotive-force
                  (Battery-component ?eps))
                31.0)
             (Open-p (Relay-component ?eps)))
        CPD4)))

Figure 2-a: Function F1 of EPS

We consider a function to be an agent's belief about how
an object is to be used in some context to achieve some
effect. Thus, our representation of a function specifies the
object, the context, and the effect. However, it does not
specify an agent, which is implicitly assumed to be
whoever is using the representation. Formally, a function
is defined as follows:

Definition 1: Function
A function F is a triplet {Df, Cf, Gf}, where:

Df denotes the device of which F is a function.

Cf denotes the context in which the device is to
function.

Gf denotes the functional goal to be achieved.

The device specification, Df, specifies the class of the
device and the symbol by which the device will be referred
to in the rest of the definition of F. The example in Figure
2-a states that the function is of an Electrical-power-system
which will be referred to as ?eps in the rest of the
definition.
[Figure 2-b: CPD diagrams. Only the following node and arc labels are
recoverable, listed in their original order.]

n1: (Shining-p ?sun)
causal
(> (Current (+terminal (Load-component ?eps))) 0)
(by-function-of (Solar-array-component ?eps))
(by-function-of (Solar-array-component ?eps))
(> (d (Stored-charge (Battery-component ?eps))/dt) 0)
n7: (> (Current (+terminal (Load-component ?eps))) ...
n6: (> (Electromotive-force (Battery-component ?eps)) 0)
(by-function-of (Battery-component ?eps))
(< (d (Stored-charge (Battery-component ?eps))/dt) 0)
n10: (> (Electromotive-force (Battery-component ?eps)) ...
(by-function-of (Controller-component ?eps))
causal
n11: (Open-p (Relay-component ?eps))

CPD4:
n13: (< (Electromotive-force (Battery-component ?eps)) 31.0)
(by-function-of (Controller-component ?eps))
causal
(Closed-p (Relay-component ?eps))

Figure 2-b: CPD's of Function F1 of EPS.


The  notion  of  a  device  function  assumes  some  physical 
context in which the device is placed, and Cf is a
specification of such a context. Cf consists of two parts, a
set of objects and a set of conditions on those objects. For
example, Figure 2-a states that there must exist an instance
of  Sun  and  an  instance  of  electrical  load.  The  conditions 
must  hold  throughout  a  behavior  in  order  for  the  function 
to  be  verified  in  the  behavior. 

Formally, the Object-set of a Cf is a list of pairs {var,
type}, where var is a symbol to be used in the description
of F to refer to the object, and type is the type (class) of the
object. Conditions is a logical expression involving the
variables defined in the Object-set and Df.

The third part of the function definition, Gf, specifies
the behavior to be achieved by the device used in a specific
manner. Gf of a function is represented as a Boolean
combination of Causal Process Descriptions (CPDs) and
conditions involving the variables defined in Df and the
Object-set of Cf. Each CPD is an abstract description of
expected behavior in terms of a causal sequence of events.
In the following, we formally define a CPD.

Causal  Process  Descriptions  (CPD's) 

Figure 2-b shows examples of CPD's which are part of the
functional  specification  of  the  EPS.  A  CPD  is  a  directed 
graph,  in  which  each  node  describes  a  state  and  each  arc 
describes  a  temporal  and  (optionally)  a  causal  relation 
between  states. 

A node specifies a condition on a state. The condition is
a logical sentence about the state of the world at some time
using the variables defined in the Df and Cf portions of
the function. For example, the node n1 in Figure 2-b states
the condition that the sun be shining. One or more nodes
in each CPD are distinguished as the initial node(s). In the
figures, the initial nodes are indicated with a thick oval. A
condition specified by a node can contain AND and OR as
logical connectives. When the meaning is clear, we will
use the name of a node to refer to the condition represented
by the node.

The  arcs  in  a  CPD  are  directed  and  specify  temporal  and 
causal  relations  among  nodes.  An  arc  has  the  following 
attributes: 

source: The node at the tail of the arc.

destination: The node at the head of the arc.

causal-flag: An indicator of whether the
relationship between the states described by the
source and destination nodes is causal. (The
relationship is always temporal.)

temporal-relation: =, <, or ≤, indicating the
temporal relation between the states described by
the source and destination nodes. = means that
the states described by the two nodes are to be the
same state. < means the state described by the
source node must strictly precede the state
described by the destination node, and ≤ means
the state described by the source node must either
be the same as or precede the state described by
the destination node.

causal-justification: If an arc is 'causal', one can
attach a justification for the causal relation. A
justification takes the form of a Boolean
combination of the following predicates:

(by-function-of <model-fragment>)
(with-participation-of <model-fragment>)

The meaning of these predicates will be explained
after we give a precise definition of a causal
relation among nodes.

In order to refer to attributes of arcs, we will use the
attribute name (e.g., source, destination, etc.) as a function
of the arc, as in "source(ai)".

We will write ni =>c nj when there is a causal arc from
ni to nj. As a condition specified by a node can be a
Boolean combination of conditions, the following defines
the meaning of causal relations among them, where e1, e2,
and e3 are conditions:

a) (AND e1 e2) =>c e3  ≡
   (AND (e1 =>c e3) (e2 =>c e3))

b) e1 =>c (AND e2 e3)  ≡
   (AND (e1 =>c e2) (e1 =>c e3))

c) (OR e1 e2) =>c e3  ≡
   (OR (e1 =>c e3) (e2 =>c e3))

d) e1 =>c (OR e2 e3)  ≡
   (OR (e1 =>c e2) (e1 =>c e3))
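
Rules (a) through (d) amount to distributing the causal relation over AND and OR on either side. A minimal Python sketch of that rewriting follows, assuming conditions are encoded as nested tuples such as ("AND", e1, e2) and that anything else is atomic; the tuple encoding and the "=>c" tag are illustrative choices, not CFRL syntax.

```python
def expand(src, dst):
    """Rewrite `src =>c dst` into a Boolean combination of atomic causal relations."""
    if isinstance(src, tuple) and src[0] in ("AND", "OR"):    # rules (a) and (c)
        return (src[0],) + tuple(expand(e, dst) for e in src[1:])
    if isinstance(dst, tuple) and dst[0] in ("AND", "OR"):    # rules (b) and (d)
        return (dst[0],) + tuple(expand(src, e) for e in dst[1:])
    return ("=>c", src, dst)                                  # both sides atomic


print(expand(("AND", "e1", "e2"), "e3"))
print(expand("e1", ("OR", "e2", "e3")))
```

Applying the function recursively handles nested combinations on both sides, so the result contains only relations between atomic conditions.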

Semantics  of  a  CPD 

A CPD can be considered to be an abstract specification of
a behavior. Unlike a trajectory, it does not specify every
state or everything known about each state. It only
specifies some of the facts that should be true during the
course of the behavior and partial temporal/causal
orderings among those facts. The intuitive meaning of a
CPD is that:

•  For each node in the CPD, there must be a state in the
trajectory in which the condition specified by the node is
satisfied, and

•  For each pair of nodes directly connected by an arc, the
causal and temporal relationships specified by the arc
must exist in the trajectory.

In order for us to evaluate these conditions against a
behavior, we must define their meanings in terms of the
languages used to describe a (simulated or actual)
behavior. In this paper, we will do so in terms of the
behavior representation formalism described earlier.
However, note that CFRL itself is independent of the
particular behavior representation language used, and that
one would need to provide different definitions in order to
evaluate functional specifications in CFRL against
behaviors generated by a different scheme.

We first present the definition of a causal dependency
relation between sentences in a trajectory and the causality
constraints that can be associated with a CPD arc. We
then define the requirements for a trajectory to match a
CPD and for a trajectory to match a function goal. Finally,
we use those definitions to define the requirements for a
trajectory to achieve a function.

A few words about notation: We will attach [s] to a
sentence to denote the sentence holds in state s. Therefore,
p[s] means that p holds in state s. We will also associate a
state with models and variables to denote sentences as
follows:

m[s]: An instance of model fragment m is active in s.
v[s]: The value of variable v in s (i.e., an axiom of
the form (= (value v s) c) for some constant c).

We will use the relations <, >, and ≤ to express
temporal ordering among states in a trajectory. For
example, for states s1 and s2 in a trajectory, "s1 < s2"
means that s1 strictly precedes s2 in time. Note that
ordering is total for states in a trajectory because a
trajectory is a linear sequence of states, while the ordering
is partial for states in a CPD.

Intuitively, we say p2 is causally dependent on p1 in
trajectory Tr, written p1 => p2, when it can be shown that
p1 being true in Tr eventually leads to p2 being true in Tr.

Definition 2: Causal Dependency

The causal dependency relation, =>, is a binary relation
between sentences in a trajectory with the following
properties:

1. For all atomic sentences p, states s, model fragments m,
and variables v:

a) If p[s0], p[s1], ... p[s] (i.e., if p is part of the initial
conditions and is never changed), then # => p[s].
(And we say that p[s] is exogenous.)

b) If model fragment m represents an event and asserts p,
and if there exists a state sj such that sj < s, ¬p[sj],
m[sj], and p[sk] for all k > j (i.e., p became true at
some point before s due to m), then m[sj] => p[s].

c) If model fragment m represents a continuous process
and has p as a consequence, and if there exists a state
sj such that sj < s, ¬p[sj], m[sj], and p[sk] for all k
> j (i.e., p became true at some point before s due to
m), then m[sj] => p[s].

d) If model fragment m has p as a condition, then p[s] =>
m[s].

e) If v occurs in p as a term and p is not v[s], then v[s]
=> p[s].

f) If v is an exogenous variable, # => v[s].

g) For all variables v' such that v' -> v is in the causal
ordering¹ in s:

(i) v'[s] => v[s];

(ii) If p[s] is the equation through which v depends
on v', then p[s] => v[s].

h) For all variables v' such that v and v' are in a feedback
loop in the causal ordering in s:

(i) v'[s] => v[s] and v[s] => v'[s];

(ii) For each equation p such that p is part of the
feedback loop and v appears in p, p[s] => v[s].

i) If sj is the state immediately following s, and dv is the
time-derivative of v in s, then dv[s] => v[sj].

2. => is transitive.

When pi => pj, we will say that pj is causally dependent
on pi or that pi causes pj. Given statements p[si] and p[sj]
such that p[si] => p[sj], we call the causal sequence of
statements starting from p[si] and leading to p[sj] the
causal path from p[si] to p[sj].
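
Because => is transitive, deciding whether one statement causes another reduces to finding a chain of direct dependencies, and that chain is the causal path. The sketch below assumes the direct dependencies produced by the base cases of Definition 2 have already been collected into a dictionary; the function name and the example statements are hypothetical.

```python
from collections import deque


def causal_path(direct, start, goal):
    """Return one causal path from `start` to `goal`, or None if none exists.

    `direct` maps a statement to the statements that directly depend on it.
    Following these links step by step implements transitivity.
    """
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        for nxt in direct.get(path[-1], ()):
            if nxt == goal:
                return path + [nxt]
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical direct dependencies for a two-state trajectory.
direct = {
    "Shining-p(sun)[s1]": ["solar-array-active[s1]"],
    "solar-array-active[s1]": ["d(stored-charge)/dt[s1]"],
    "d(stored-charge)/dt[s1]": ["stored-charge[s2]"],
}
print(causal_path(direct, "Shining-p(sun)[s1]", "stored-charge[s2]"))
```

Returning the path itself, rather than only a yes/no answer, matters because the causality constraints on CPD arcs are stated in terms of what the causal path includes.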

Having  defined  the  meaning  of  a  causal  relation  among 
statements, we can now explain the meaning of the
predicates  used  to  justify  causal  arcs  in  a  CPD. 

Definition 3: Causality constraints
Given an arc a from node ni to nj in a CPD and a model
fragment m, causality constraints of the following form can
be associated with a:

a) (by-function-of m), meaning that the causal path from
ni to nj includes a consequence of an instance of m;

b) (with-participation-of m), meaning that the causal
path from ni to nj includes a consequence of an instance
of a model fragment in which an instance of m
participates.

These predicates do not imply specific commitments as
to how the components participate in the causal process.
They give the designer the capability of using whatever
component has the desired function, independent of its
particular mechanism.

We can now present the definitions on which
verification of a trajectory with respect to a CPD is based.

Definition 4: Matching of a state and a node

A state s in a trajectory and a node n in a CPD are said to
match if the condition specified in n is true in s.

¹ Causal ordering is a technique for determining causal
dependency relations among variables in a set of equations
(Iwasaki & Simon 1986).

Having defined the meaning of a causal relation among
statements in a trajectory, we can now define the meaning
of the causal and temporal relations between linked nodes
of a CPD.

Definition 5: Satisfying the constraints of an arc
If a is an arc from node ni to nj in a CPD, then the causal
and temporal constraints of a are satisfied at states si and sj
if both of the following conditions are satisfied:

a) si < (= or ≤) sj when ni < (= or ≤) nj, respectively.

b) If arc a is causal and if ni and/or nj are Boolean
combinations of conditions, then the causal relation
between ni and nj can be rewritten as a Boolean
combination of causal relations of the form ei =>c ej,
where ei and ej are atomic conditions. ei[si] =>c ej[sj]
is satisfied if for every variable² vi used in ei and every
variable vj used in ej, vi[si] => vj[sj] and the causal
path from vi[si] to vj[sj] satisfies the causal
justification on a.

Definition 6: Matching of a CPD and a trajectory
Let T be a trajectory consisting of a linear sequence of m
states, s1 through sm. Let CPDi be a CPD consisting of a
set of nodes, Ni, and a set of arcs, Ai. CPDi and T are
said to match iff all the following conditions are satisfied:

a) The initial nodes of CPDi match the initial state s1 in T.

b) For each remaining node n in Ni, there exists a state in
T that matches n such that for every arc a in Ai from
nodes ni to nj, the temporal and causal constraints
specified by a are satisfied by the states matched to ni
and nj.
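
The temporal part of Definitions 4 through 6 can be sketched as a small search over assignments of nodes to states. This is an illustrative simplification, not the verification program described in this paper: states are plain dictionaries, node conditions are Python predicates, "<=" stands for ≤, and causal constraints are deliberately left unchecked. All names are assumptions.

```python
import operator
from itertools import product

TEMPORAL = {"<": operator.lt, "=": operator.eq, "<=": operator.le}


def match_cpd(nodes, initial, arcs, trajectory):
    """Return a node -> state-index assignment, or None if there is no match.

    nodes:      dict mapping node name to a predicate over a state
    initial:    set of node names that must match the first state
    arcs:       list of (source, destination, temporal_relation) triples
    trajectory: list of states (here, plain dicts)
    """
    if not trajectory:
        return None
    names = list(nodes)
    candidates = []
    for n in names:
        # Initial nodes may only be matched to the first state.
        indices = [0] if n in initial else range(len(trajectory))
        candidates.append([i for i in indices if nodes[n](trajectory[i])])
    for combo in product(*candidates):
        where = dict(zip(names, combo))
        if all(TEMPORAL[rel](where[a], where[b]) for a, b, rel in arcs):
            return where
    return None


# Hypothetical two-node CPD: the sun shines, then current flows to the load.
nodes = {
    "n1": lambda s: s["sun"],
    "n2": lambda s: s["current"] > 0,
}
arcs = [("n1", "n2", "<=")]
traj = [{"sun": True, "current": 0.0}, {"sun": True, "current": 2.0}]
print(match_cpd(nodes, {"n1"}, arcs, traj))
```

The exhaustive search is exponential in the number of nodes, which is acceptable for a sketch but would need pruning for CPDs of realistic size.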

Representation  of  the  Functional  Goal  (Gf) 

The functional goal of a function (denoted by Gf) is
represented as an expression consisting of CPDs,
conditions, quantifiers, and Boolean connectives. Nested
expressions using connectives are allowed, but a quantifier
cannot appear in the scope of another quantifier. Each
CPD must appear in the scope of one and only one
quantifier. There are two quantifiers, ALWAYS and
SOMETIMES. Connectives are AND, OR, IMPLIES, and
NOT. Syntactically, the connectives are used in the same
way as ordinary logical connectives. The following are
example Gf expressions:

(ALWAYS (AND cpd1 cpd2 (OR cpd3 cpd4)))

(OR (ALWAYS cpd1)
    (SOMETIMES (AND cpd2 cpd3)))

(ALWAYS (NOT cpd1))


² The variables used in CFRL can be different from the
variables in terms of which the trajectory states are defined, since
CFRL descriptions represent a device-level perspective, while
states in the trajectory represent a component or physical process-
level perspective. Correspondences between CFRL variables and
trajectory variables are made when the function is matched
against a specific trajectory.


Quantifiers align the initial nodes of the CPDs in their
scope as well as specify whether the described behavior
must hold in every subsequence of the trajectory or only in
some of them. The connectives and quantifiers are to be
interpreted as specified in the following definition of
matching a Gf and a trajectory.

Definition 7: Matching of a Gf and a trajectory

Let T be a trajectory consisting of a linear sequence of m
states, s1 through sm; Ti denote the subsequence of T from si
through sm; and <cpd-exp> denote a Boolean combination
of CPDs and conditions. Then:

a) (ALWAYS <cpd-exp>) matches T iff <cpd-exp>
matches Ti for each Ti (i = 1 to m).

b) (SOMETIMES <cpd-exp>) matches T iff <cpd-exp>
matches Ti for some Ti (i = 1 to m).

c) (AND <cpd-exp0> <cpd-exp1> ...) matches T iff every
conjunct matches T.

d) (OR <cpd-exp0> <cpd-exp1> ...) matches T iff at least
one of the disjuncts matches T.

e) (NOT <cpd-exp>) matches T iff <cpd-exp> does not
match T.

f) (IMPLIES <cpd-exp0> <cpd-exp1>) matches T iff
<cpd-exp0> does not match T or <cpd-exp1> does
match T.

g) Condition c matches T iff c is true in the initial state of
T.
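
The clauses of Definition 7 translate almost directly into a recursive evaluator. The following Python sketch is illustrative only and is not the verification algorithm implemented for DME. It assumes that each CPD has already been wrapped as a function from a (sub)trajectory to True or False; the names `matches` and `condition`, the tuple encoding of expressions, and the dictionary-based state format are all hypothetical.

```python
from typing import Callable, Dict, Sequence

# Assumption: a state maps variable names to values; a trajectory is a list of states.
State = Dict[str, float]
Trajectory = Sequence[State]


def condition(pred: Callable[[State], bool]) -> Callable[[Trajectory], bool]:
    """Clause (g): a condition matches T iff it is true in the initial state of T."""
    return lambda traj: pred(traj[0])


def matches(expr, traj: Trajectory) -> bool:
    """Evaluate a Gf expression against a trajectory, following clauses (a)-(g).

    A leaf (a CPD matcher or a wrapped condition) is any callable on a trajectory.
    A compound expression is a tuple such as ("ALWAYS", sub) or ("AND", e0, e1, ...).
    """
    if callable(expr):
        return bool(expr(traj))
    op, *args = expr
    if op == "ALWAYS":      # (a) every suffix Ti = si..sm must match
        return all(matches(args[0], traj[i:]) for i in range(len(traj)))
    if op == "SOMETIMES":   # (b) at least one suffix must match
        return any(matches(args[0], traj[i:]) for i in range(len(traj)))
    if op == "AND":         # (c)
        return all(matches(a, traj) for a in args)
    if op == "OR":          # (d)
        return any(matches(a, traj) for a in args)
    if op == "NOT":         # (e)
        return not matches(args[0], traj)
    if op == "IMPLIES":     # (f)
        return (not matches(args[0], traj)) or matches(args[1], traj)
    raise ValueError(f"unknown operator: {op!r}")


# Toy usage with a made-up three-state trajectory.
trajectory = [{"v": 30.0}, {"v": 34.0}, {"v": 35.0}]
high_voltage = condition(lambda s: s["v"] > 33.8)
print(matches(("SOMETIMES", high_voltage), trajectory))  # True: holds from the second state on
print(matches(("ALWAYS", high_voltage), trajectory))     # False: fails in the first state
```

Because ALWAYS and SOMETIMES re-evaluate their argument on every suffix, the quantifiers are what align the initial nodes of the CPDs in their scope with successive states of the trajectory.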

Finally,  we  complete  the  definition  of  the  meaning  of  a 
function,  as  follows: 

Definition 8: A trajectory achieving a function
A trajectory T achieves a function F when the condition
specified in Cf holds throughout T and Gf matches T.
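
Definition 8 combines two checks of different shape: the context condition is tested in every state, while the goal is matched against the trajectory as a whole. A minimal Python sketch of that combination follows; the function name `achieves` and the two callable parameters are hypothetical stand-ins for Cf and for a Gf matcher.

```python
from typing import Callable, Dict, Sequence

State = Dict[str, float]          # assumption: a state maps variable names to values
Trajectory = Sequence[State]


def achieves(
    trajectory: Trajectory,
    context_condition: Callable[[State], bool],   # stands in for the condition of Cf
    goal_matches: Callable[[Trajectory], bool],   # stands in for "Gf matches T"
) -> bool:
    """Return True when the context holds in every state and the goal matches T."""
    context_ok = all(context_condition(state) for state in trajectory)
    return context_ok and goal_matches(trajectory)
```

A trajectory in which the goal is matched but the context condition fails in even one state does not achieve the function under this reading.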

Discussion  and  Summary 

In this paper, we have presented CFRL, a language for
specifying an expected function of a device and defined its
semantics in terms of the type of behavior representation
widely used in model-based qualitative simulation. The
language allows one to explicitly state the physical context
in which the function is to be achieved and to describe the
function as an expected causal sequence of events. Since
the concept of causal interactions among components is
essential to the understanding of a function, the language
allows explicit representation of causal interactions and
constraints on such interactions.

CFRL is based on the work on Functional
Representation (Sembugamoorthy & Chandrasekaran
1986), and it is a further extension of the work presented in
(Iwasaki & Chandrasekaran 1992). We have extended the
expressive power of the function specification languages
described in those papers and have provided a formal
foundation for the semantics of the resulting language.

Franke (Franke 1991) also proposed matching design
intent with simulated behavior. Unlike other work on
functional representation, he focuses on representing the
purpose of a design modification and not that of a device
itself. He developed a representation scheme, called TED,
in which he expresses the purpose for making a
modification in a structure. TED's representation of a
function can be a sequence (not necessarily linear) of
partial descriptions, which is matched against states in a
sequence of qualitative states generated by QSIM. To
prove that a function is achieved by a modification, he
compares the behavior of the original structure and that of
the modified structure.

Bradshaw  and  Young  (Bradshaw  &  Young  1991)  also 
represent the intended function in a manner similar to
Functional  Representation.  They  built  a  system  called 
DORIS,  which  uses  the  knowledge  generated  by 
qualitative  simulation  for  evaluating  device  behavior  as 
well  as  for  diagnosis  and  explanation. 

The most important characteristic that distinguishes our
work from those by Franke and by Bradshaw and Young is
the central role causal knowledge plays in CFRL. We
conjecture that causal relations are an essential part of
functional knowledge, and that representation of functional
knowledge must allow explicit description of the causal
processes involved. Furthermore, verification of a
function must ascertain that the expected causal chain of
events takes place, since the satisfaction of the functional
goal alone does not necessarily indicate that the device is
functioning as intended.

Because the semantics of CFRL is defined in terms of
matching between a behavior and a functional
specification, the language is immediately useful for the
purpose of behavior verification. We have designed and
implemented an algorithm that verifies a behavior
produced by the DME system with respect to a function
specified in CFRL as defined in this paper. Initial testing
of the algorithm has included verifying the functional
specifications of the EPS as given above. Care must be
taken in designing such an algorithm to assure that
exponential search is not required to find a match between
a trajectory and a CPD. We are currently in the process of
analyzing the computational complexity of the problem
and our algorithm.

We expect formal functional specifications to have
many uses throughout the life cycle of a device (Iwasaki
et al. 1993). For example, in the early stages of the design
process, designers often do 'top down' design by
incrementally introducing assumptions about device
structure and causality relationships. Such design
evolution could be expressed as incremental refinements of
a CFRL functional specification. DME could assist a
designer in this functional refinement process by assuring
that each successive specification is indeed a refinement of
its predecessor so that any device that satisfies the
refinement also satisfies the predecessor.
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Abstract^ 

When designing a device, the final product of the
design process is usually considered to be a physical
specification of a device. However, the design of the
causal mechanism underlying the physical
specification, i.e. how the device is intended to work
to achieve its function, is a product just as important as
the physical specification, if not more so. Capturing this
knowledge of causal mechanism is necessary in order
to understand the physical specification of the device
as well as to evaluate and refine the specifications
during the design process. Despite the importance of
such knowledge, existing CAD tools do not support its
explicit representation or manipulation. We describe a
design support system under development in which
knowledge of both the causal mechanism and the
physical structure of a device being designed is
explicitly represented and manipulated. The system
allows the designer to provide functional
specifications at various levels of abstraction in a
language called CFRL (Causal Functional
Representation Language). The CFRL specification
acquired from the user enables the system to evaluate
the physical specification as it is being developed in
order to provide useful feedback to the designer.
Furthermore, functional specifications provide an
important basis for recording the engineer's design
rationale.

1  Introduction 

Understanding how a device works requires understanding
the causal sequences of events that achieve the function of
the device. Books on how man-made or natural 'devices'
work are filled with drawings of structures of devices
accompanied by explanations such as the following example
of an aneroid barometer [Macaulay, 1988]:


*The research by the first three authors is supported in part by
the Advanced Research Projects Agency, ARPA Order 8607,
monitored by NASA Ames Research Center under grant NAG 2-
581, and by NASA Ames Research Center under grant NCC 2-
537. Chandrasekaran's research is supported by the Advanced
Research Projects Agency by means of AFOSR contract F49620-
89-C-0110 and AFOSR grant 89-0250.


As  the  air  pressure  falls,  the  spring  pulls  the  side  of  the 
capsule  outward.  The  arm  rises,  causing  the  rocking 
bar  to  slacken  the  chain.  The  hairspring  unwinds, 
moving the pointer counter-clockwise until the chain is
pulled taut.

Such  explanations  refer  to  parts  of  the  structure  and 
describe  how  it  works  in  terms  of  the  sequence  of  causal 
interactions  among  the  parts  that  lead  to  the  desired  result. 
In  other  words,  such  explanations  and  the  drawing  they 
accompany describe the device structure along with the
conceptual causal mechanism underlying the structure.

When designing a device, the final product of the design
process is usually considered to be a physical specification
of a device. However, the design of the causal mechanism
underlying the physical specification is a product just as
important as the physical specification, if not more so. Gero
makes this point clear in his model of the design process
shown in Figure 1 [Gero, 1990]. Gero argues that, except in
a trivial design problem, a structure is not generated directly
from the requirements. In his model of design synthesis, a
specification of the expected behavior is generated from the
requirements, and the physical structure is generated from
the expected behavior. In that model, the expected behavior
corresponds to the expected causal sequence of interactions,
i.e. how the device is to work.


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Capturing  this  knowledge  of  how  the  device  is  to  work  to 
achieve  its  function  is  important  in  order  to  understand  the 
physical  specification  of  the  device  as  well  as  to  evaluate 
and  refine  the  specifications  during  the  design  process.  One 
can  even  view  the  design  of  a  conceptual  mechanism 
as driving the generation of a physical specification.
Despite the importance of such knowledge, existing CAD
tools do not facilitate its explicit representation or
manipulation.

According to Gero, the design process involves the
following types of activities to manipulate the four types of
information shown in Figure 1.

Formulation: Transforming requirements to expected
behavior. F → Be

Synthesis: Transforming expected behavior to a
structure. Be → S

Analysis: Deriving behavior from a structure. S → Bs

Evaluation: Comparing the predicted and expected
behaviors. Bs ↔ Be

Reformulation: Reformulating the expected behavior
based on evaluation results. (S, Bs) → Be

Description: Producing a description of the designed
structure. S → D

A comprehensive design support system needs to support
multiple iterations through all those steps, since any
non-trivial design problem will require many such iterations.
We are extending our Device Modeling Environment
(DME) system [Iwasaki and Low, 1991] to provide such
support and to enable explicit representation and
manipulation of knowledge of both the causal mechanism
and the physical structure of a device being designed. The
system allows the designer to provide functional
specifications at various levels of abstraction in a language
called CFRL (Causal Functional Representation Language).
The CFRL specification acquired from the user enables the
system to evaluate the physical specification as it is being
developed in order to provide useful feedback to the
designer. Furthermore, functional specifications provide an
important basis for recording the engineer's design rationale.

This  paper  describes  the  use  of  CFRL  in  the  extended 
version  of  DME.  We  first  describe  CFRL  and  then  illustrate 
the  capabilities  of  the  new  system  by  presenting  a 
hypothetical  design  scenario.  We  conclude  with  a 
discussion of our current status and related work.

2  CFRL 

CFRL is a formalism we have developed for representing
the functions and expected behavior of a device. It allows
one to represent knowledge of the functions that a device is
intended to achieve and also of the sequence of causal
interactions among its components that lead to the achievement
of the functions. In CFRL, the function of the overall
device is described first, and the behavior of each component
is described in terms of how it contributes to the function.

We have provided the language with a well-defined
semantics in terms of a behavior representation widely used
in model-based qualitative simulation [Vescovi et al., 1993],
and we have used CFRL as the basis for a functional
verification program which determines whether a behavior
achieves an intended function.

In this section, we give a brief overview of CFRL. CFRL
is fully described in [Vescovi et al., 1993]. Before
describing CFRL, we present the device that we will use
throughout the paper as an example. The device is the
electrical power system (EPS) aboard an Earth-orbiting
satellite [Lockheed, 1984]. A simplified schematic diagram
of the EPS is shown in Figure 2. The main purpose of the
EPS is to supply a constant source of electricity to the
satellite's other subsystems. The solar array generates
electricity when the satellite is in the sun, supplying power
to the load and recharging the battery. When the satellite is
in a shadow, the battery supplies power to the load. Since
the battery can be damaged when it is charged beyond its
capacity, the charge current controller opens the relay when
the voltage exceeds a threshold to prevent the battery from
being over-charged.


Figure 2: Electrical Power System

CFRL allows one to explicitly state the physical context
in which a function is to be achieved and to describe a
function as an expected causal sequence of events. Since
the concept of causal interactions among components is
essential to the understanding of a function, the language
allows explicit representation of causal interactions and
constraints on such interactions. Figure 3 shows an
example of the representation of an EPS function.

Formally, a function F is defined as a triplet (DF, CF,
GF), where:

DF denotes the device of which F is a function.

CF denotes the context in which the device is to
function.

GF denotes a description of the functional goal to be
achieved.

The notion of a device function assumes some physical
context in which the device is placed, and CF is a
specification of such a context. CF consists of two parts, a
set of objects and a set of conditions on those objects. The
conditions must hold throughout a behavior in order for the
function to be achieved by the behavior.
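
As a rough illustration of the triplet, the structure could be recorded as follows in Python. This is only a sketch under our own assumptions: the class and field names (`Context`, `Function`, `objects`, `conditions`) are hypothetical, and the goal is left as an opaque value because its internal structure is described separately.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Sequence

State = Dict[str, Any]   # assumption: a state maps variable names to values


@dataclass
class Context:
    """CF: a set of objects plus conditions on those objects."""
    objects: Dict[str, str]                                   # variable -> type, e.g. {"?sun": "Sun"}
    conditions: List[Callable[[State], bool]] = field(default_factory=list)

    def holds_throughout(self, behavior: Sequence[State]) -> bool:
        """The conditions must hold in every state of the behavior."""
        return all(cond(state) for state in behavior for cond in self.conditions)


@dataclass
class Function:
    """F = (DF, CF, GF)."""
    device: str        # DF: the device of which F is a function
    context: Context   # CF: the context in which the device is to function
    goal: Any          # GF: description of the functional goal


# With an empty condition list the context is trivially satisfied,
# which corresponds to a context whose condition is simply T (true).
eps_function = Function(
    device="?eps: Electrical-power-system",
    context=Context(objects={"?sun": "Sun"}),
    goal="(ALWAYS ...)",   # placeholder, not a real goal expression
)
```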




GF, the goal to be achieved by the function, is
represented as a Boolean combination of Causal Process
Descriptions (CPDs). Each CPD is an abstract description
of expected behavior in terms of a causal sequence of
events. The abstracted behavior is represented as a directed
graph in which each node describes a state and each arc
describes a temporal and (optionally) a causal relation
between states.


DF: ?eps: Electrical-power-system
CF: Object-set: ?sun: Sun, Condition: T
GF: (ALWAYS
      (IF (AND (> (Electromotive-force
                    (Battery-component ?eps))
                  33.8)
               (Closed-p (Relay-component ?eps)))
       THEN CPD3))

CPD3:

n1: (> (Electromotive-force (Battery-component ?eps)) 33.8)
  |
  | causal, <  (by-function-of (Controller-component ?eps))
  v
n2: (Open-p (Relay-component ?eps))
  |
  | causal, <
  v
n3: (< (d (Stored-charge (Battery-component ?eps)) / dt) 0)

Figure  3  :  EPS  Function  in  CFRL 

A node specifies a condition on a state. The condition is
a logical sentence that must hold in a state of the world at
some time using the variables defined in DF and CF. The
arcs in a CPD are directed and specify temporal and causal
relations among nodes. An arc has the following attributes:

causal-flag: An indicator of whether the relationship
between the states described by the source and destination
nodes is causal.

temporal-relation: =, <, or ≤, indicating the temporal
relation between the states described by the source and
destination nodes. = means that the states described by
the two nodes are to be the same state, < means the state
described by the source node must strictly precede the
state described by the destination node, and ≤ means the
state described by the source node must either be the
same as or precede the state described by the destination
node.

causal-justification: If an arc is "causal", one can attach a
justification for the causal relation. A justification takes
the form of a Boolean combination of the predicates
by-function-of (<component>) and with-participation-of
(<component>). Those predicates are used to specify the
participation of a device component in a particular portion
of a causal process.
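
The three arc attributes map naturally onto a small record type. The Python sketch below is an assumption-laden illustration, not the DME data model: `Arc`, `temporal_ok`, and the encoding of a justification as a tuple are hypothetical, and "<=" stands for the "same as or precedes" relation.

```python
import operator
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Temporal relations compare the positions of the matched states in a behavior.
TEMPORAL_RELATIONS = {
    "=": operator.eq,    # both nodes describe the same state
    "<": operator.lt,    # source state strictly precedes destination state
    "<=": operator.le,   # source state is the same as or precedes destination state
}


@dataclass(frozen=True)
class Arc:
    source: str
    destination: str
    temporal_relation: str                      # one of "=", "<", "<="
    causal: bool = False                        # the causal-flag
    justification: Optional[Tuple[str, str]] = None
    # e.g. ("by-function-of", "Controller-component"); only meaningful if causal


def temporal_ok(arc: Arc, matched_index: Dict[str, int]) -> bool:
    """Check an arc's temporal relation, given node name -> index of matched state."""
    compare = TEMPORAL_RELATIONS[arc.temporal_relation]
    return compare(matched_index[arc.source], matched_index[arc.destination])


arc = Arc("n1", "n2", "<", causal=True,
          justification=("by-function-of", "Controller-component"))
print(temporal_ok(arc, {"n1": 4, "n2": 5}))   # True: state 4 strictly precedes state 5
```

Checking the causal-flag and its justification would additionally require access to the causal dependencies recorded by the simulator, which this sketch does not model.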


Because the semantics of CFRL is defined in terms of
matching a functional specification to a device behavior, the
language is immediately useful for design verification. In
Figure 4, we show an example of matching a CPD to an
EPS behavior generated by a qualitative simulator [Iwasaki
and Low, 1991]. The left portion of Figure 4 shows part of
an EPS function. (The notation has been changed slightly
for legibility.) The right portion shows part of a predicted
behavior for EPS. Since the IF part becomes true at state
s4, s4 is matched against node n1. Nodes n2 and n3 match
with s5 and s6. The temporal constraints on the arcs are
clearly satisfied since s4 < s5 < s6. The way this portion of
the behavior was simulated shows the existence of the
following causal path involving the controller and the relay:

Voltage-battery > 33.8 [s4] => Turn-signal-on [s4]

=> Signal-control = on [s5] => Relay-open [s6].

Thus, as required by the arcs in the CPD, the functioning
of the controller plays a role in causing the condition
specified in node n2 to become true and the functioning of
the relay plays a role in causing the condition specified in
node n3 to become true.
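
To make the temporal part of this matching concrete, the following Python sketch searches for states that satisfy a linear chain of CPD nodes. It is a simplification under stated assumptions: it handles only a chain (not a general graph), checks only temporal relations (not causal justifications), and the function `match_chain`, the state variables, and the three toy states are hypothetical rather than taken from Figure 4.

```python
from typing import Any, Callable, Dict, List, Optional, Sequence

State = Dict[str, Any]
NodeCondition = Callable[[State], bool]


def match_chain(
    nodes: Sequence[NodeCondition],
    relations: Sequence[str],        # relations[k] links nodes[k] -> nodes[k + 1]
    states: Sequence[State],
) -> Optional[List[int]]:
    """Return indices of states matched to each node, or None if no match exists.

    The first node is anchored to states[0]; later nodes are found by backtracking.
    """
    if not nodes or not states or not nodes[0](states[0]):
        return None

    def extend(k: int, prev: int) -> Optional[List[int]]:
        if k == len(nodes):
            return []
        relation = relations[k - 1]
        if relation == "=":
            candidates = [prev]
        elif relation == "<":
            candidates = range(prev + 1, len(states))
        else:  # "<=": same state or a later one
            candidates = range(prev, len(states))
        for j in candidates:
            if nodes[k](states[j]):
                rest = extend(k + 1, j)
                if rest is not None:
                    return [j] + rest
        return None

    rest = extend(1, 0)
    return None if rest is None else [0] + rest


# Toy behavior standing in for states s4, s5, s6 (values are made up).
behavior = [
    {"voltage": 34.0, "signal": "off", "relay": "closed"},
    {"voltage": 34.0, "signal": "on", "relay": "closed"},
    {"voltage": 34.0, "signal": "on", "relay": "open"},
]
chain = [
    lambda s: s["voltage"] > 33.8,
    lambda s: s["signal"] == "on",
    lambda s: s["relay"] == "open",
]
print(match_chain(chain, ["<", "<"], behavior))   # [0, 1, 2]
```

Naive backtracking of this kind can revisit the same partial matches many times on longer behaviors, so a practical verifier would need a more careful search strategy.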


Figure 4: Example of matching a CPD to a behavior.

3  A  Design  Support  System 

A comprehensive design support system needs to support
multiple iterations through all the steps of design, including
formulation, synthesis, analysis, evaluation, reformulation,
and description, since any non-trivial design problem will
require many such iterations. Furthermore, even though the
[illegible in source]
are rarely completely known a priori. Even known
requirements often change during the design process, a
provision not allowed for in more formal models of the
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design  process.  During  the  process  of  generating  and 
evaluating alternative designs, the designer may discover
unforeseen consequences of given requirements or features
of the environment that have effects on the performance.
As a consequence, the requirements are likely to change
many times. A practical design environment must support
such incremental acquisition and evolution of the
requirements. Our goal is to develop a design support
system that can facilitate all these activities by:

•  Providing  languages  for  describing  both  device  structure 
and  functional  specifications  at  various  levels  of 
abstraction. 

• Supporting incremental acquisition of both constraints
that must be satisfied by the device and properties of the
environment in which the device is expected to operate.

•  Providing  analysis  tools  for  evaluating  the  design  through 
simulation  and  for  verifying  that  the  simulated  behavior 
satisfies  the  functional  specification. 

We view the design process as one of parallel refinement
of two types of specifications: physical and functional. The
two specifications are closely linked in many ways. The
functional specification drives development of the physical
specification by providing the goals to be achieved. The
functional specification refers to parts of the physical
specification in order to explain how different components
are to interact with each other to achieve the overall
function. The physical specification must include at least
all the components that are referenced in the functional
specification and all connections that enable them to
interact. Analysis of the behavior of the physical design in
turn may suggest additional functions that are needed.
Finally, the functional specification provides an important
part of the rationale for the physical specification as well as
the evaluation criteria for design.

The following section describes the design environment
through a hypothetical design scenario.

3.1  Hypothetical  Design  Scenario 

The given task is to design an electrical power system (EPS)
for an Earth-orbiting satellite. The top-level goal is
expressed in CFRL as shown in Figure 5.


DF: ?eps: Electrical-power-system

CF: Object-set: ?sun: Sun, ?l: Electrical-load

    Condition: T
GF: (ALWAYS CPD0)


Figure 5: The top-level function of EPS


It says that the power supply generates power, supplying
electricity to the load. The physical specification at this
point, shown in Figure 6, is very simple. It shows that there
is an electrical load connected to a power-supply.


Figure 6: Top-level physical specification of EPS

The designer first decides to refine the power-supply
component in Figure 6 by selecting a specific class of
devices as a power supply. The system, using its
component library, displays the following options:

Power-supply options: battery, solar array, thermal
generator

The user selects "battery". The physical specification is
refined to include a battery as the power supply. The user
wants to see the behavioral implications of this selection.
The system performs qualitative simulation of this abstract
physical specification using its generic knowledge of
batteries. The simulation results, as shown in Figure 7,
indicate that the battery will run out of power at some
unspecified time t, after which the load will not be supplied
power.


The system's verification module compares this predicted
behavior with the top-level functional specification and
informs the user that the function is not satisfied since the
battery fails to supply power after some time. Furthermore,
causal analysis by the verification module reveals that the
failure to provide power causally depends on the capacity of
the battery as well as the power demand by the load. These
analysis results prompt the user to refine the functional
specification by changing "ALWAYS" to "for t < 20 years"
and by adding the constraint that the electrical load is 100
watts and constant.

The user now asks the system to describe types of
batteries that can satisfy these constraints. The system,
using its battery knowledge base, describes possible types of
batteries that can meet these requirements and lists their
attributes, such as [illegible in source]. One of the
listed attributes is weight, which the system computes will
be between 500 lb and [illegible in source] lb depending on
the type of battery chosen. The user decides that this is
unacceptable for use in a satellite. She adds a new constraint
that the weight of the EPS must be less than 50 kg and rules
out using a battery as the power supply.

The user now returns to the list of alternatives supplied by
the system and selects a generator as the power supply. The
component knowledge base contains knowledge about the a
priori requirements of each type of component for its
operation. In the case of generators, the requirements
include the availability of some type of fuel such as oil,
coal, gas, or uranium. Since the system does not know
enough about the environment in which the EPS will
operate to determine if any such fuel is available, the system
displays this requirement, which leads the user to specify
the environment constraint that there is no fuel and to rule
out use of a generator.

Similarly, the user considers solar arrays and is presented
with the requirement that sun light be available. The user
adds the environmental constraint that sun light is available
only for 70 minutes out of every 100 minutes and rules out
solar arrays.

Since  none  of  the  options  for  a  power  supply  presented  by 
the system fulfills the functional goal, the designer decides
to  combine  batteries  and  solar  arrays,  so  that  the  solar  array 
can  recharge  the  battery  as  well  as  provide  power  while  the 
sun  light  is  available,  and  the  battery  can  provide  power 
when  sun  light  is  not  available.  She  modifies  the  functional 
and  physical  specifications  accordingly.  She  further  refines 
the  physical  specification  by  choosing  a  nickel-cadmium 
battery.  These  steps  produce  the  specifications  and 
constraints  shown  in  Figure  8a  and  8b. 


Environmental constraints
For every 100 minutes,
sun-light is available for
70 minutes.

¬∃x: Fuel(x)


Figure 8b: Refined constraints

At this point, the designer decides to check the design by
simulating its behavior. Predicted behavior indicates the
possibility of the battery heating up and eventually
becoming damaged. Noticing the battery heating up
reminds the user that the scientific equipment on board the
satellite should not be exposed to high temperature. She
adds the functional constraint that the temperature of the
EPS should not exceed 40° C. Causal analysis of the states
in which the battery heats up and becomes damaged reveals
that overcharging of the battery is the cause.

The designer modifies the design again to include a
controller, whose behavior she specifies to be opening the
connection between the solar array and the rest of the circuit
when the battery becomes fully charged and closing it
otherwise. This new design is shown in Figure 9a and 9b.
It also shows that the functional specification is refined to
include the function of the controller. The lower part of the
functional specification shows the behavior of the controller
as specified by the designer.


Functional constraints
load = 100 watts, constant
(Weight ?EPS) < 50 kg

Functional Spec

GF: (AND (ALWAYS (if (AND ¬(Fully-charged ?x)
                          (Shining-p ?sun))
                  then CPD7))
         (ALWAYS (if (OR (Fully-charged ?x)
                         ¬(Shining-p ?sun))
                  then CPD5)))


Figure 9a: Refined functional and physical specifications


Figure 9b: Physical specification containing a controller

Simulating this new design reveals chattering behavior of
the controller. The designer modifies the threshold for
closing the controller in CPD5 to c1 such that c1 < c0 to
eliminate chattering.
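
Separating the opening threshold c0 from a lower closing threshold c1 is the familiar hysteresis remedy for chattering. The Python sketch below illustrates the effect on a made-up series of charge readings; the function names, the numeric thresholds, and the readings are assumptions for illustration and do not come from the EPS model.

```python
from typing import List, Sequence


def simulate_relay(charges: Sequence[float], open_at: float, close_at: float) -> List[str]:
    """Return the relay state after each charge reading.

    The relay opens once the charge reaches open_at (c0) and closes again
    only when the charge falls below close_at (c1).
    """
    relay = "closed"
    history = []
    for charge in charges:
        if relay == "closed" and charge >= open_at:
            relay = "open"
        elif relay == "open" and charge < close_at:
            relay = "closed"
        history.append(relay)
    return history


def switch_count(history: Sequence[str]) -> int:
    """Count how many times the relay changes state."""
    return sum(1 for before, after in zip(history, history[1:]) if before != after)


# Charge hovering around the full-charge level (normalised to 1.0).
readings = [0.98, 1.00, 0.99, 1.00, 0.99, 1.00]

single_threshold = simulate_relay(readings, open_at=1.0, close_at=1.0)   # c1 == c0
with_hysteresis = simulate_relay(readings, open_at=1.0, close_at=0.9)    # c1 < c0

print(switch_count(single_threshold))   # 5: the relay chatters on every reading
print(switch_count(with_hysteresis))    # 1: the relay opens once and stays open
```

With c1 equal to c0 every small fluctuation around the threshold toggles the relay, whereas with c1 < c0 the charge must drop by a finite margin before the relay closes again.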

3.2  Summary 

To  summarize,  the  design  support  system  described  in  this 
section  has  the  following  features: 

• Allows specification at various levels of abstraction of the
physical structure and the intended functionality of the
device being designed. The functionality is represented
using CFRL and describes what the device is to achieve
and how.

• Aids analysis of the design by providing simulation
facilities which can automatically formulate a simulation
model of a given design and predict its behavior. At an
early design stage, simulation can be performed
qualitatively to uncover possible undesirable behaviors.
The system also provides causal analysis of the
undesirable behaviors to help refine the design.

• Aids evaluation of the design by a verification facility
that compares a predicted behavior against the functional
specification in CFRL to determine whether the predicted
behavior achieves the desired function.

• Aids in selection of components by presenting
alternatives and their characteristics, using its component
library indexed by function.

• Allows incremental acquisition and refinement of various
design requirements, including functional and
environmental constraints.

The final products of a design process using this design
environment include physical and functional specifications
as well as a complete list of requirements and a description
of the operating environment. The functional specification
should include a description of what the device is
expected to do, how it is expected to do it, and under what
circumstances. Furthermore, the history of the design
process will be maintained in order to enable post-facto
reconstruction of the design rationale.


refinement of functional specifications along with physical
specifications. CFRL is used to express the intended
function of the device to be designed and how the function
is to be achieved through causal interactions of its
components. Knowledge of the function of the
device allows the system to evaluate the design of the
physical structure during the design process to provide
useful feedback to the designer even at an early stage of the
design.

The described design support system is being
implemented as an extension of the existing DME system.
DME already contains model formulation and simulation
facilities, a device structure editor, and an explanation
facility [Gruber and Gautier, 1993]. An efficient algorithm
for automatically formulating an appropriate simulation
model for a given query has been developed and
implemented [Iwasaki and Levy, 1993]. We have also
implemented a behavior verification program based on
CFRL [Vescovi et al., 1993].

As Gero points out, designing the expected behavior, i.e.,
what the device is to achieve and how, is an important part
of the entire design process. Furthermore, in practical
design problems, the requirements are not fully known
initially but they are modified, augmented, and refined
during the design process partly in response to analysis
results of the design being developed. Therefore, a design
environment must support such incremental acquisition of
requirements as well as functional specifications.

A design support system needs to maintain a history of
the design process, including development of the functional
and physical specifications, of requirements, and the
analysis performed of intermediate designs, including the
desirable or undesirable behaviors discovered which led to
further refinement of the design. Such a record can be used
later to reconstruct not only rationale for the design of the
physical structure, but also rationale for requirements and
functional specifications.

There has been significant previous work on both
representation of function and its use in reasoning about
physical devices. CFRL is based on the work on Functional
Representation [Sembugamoorthy and Chandrasekaran,
1986], and it is a further extension of the work presented in
[Iwasaki and Chandrasekaran, 1992]. We have extended
the expressive power of the language described in those
papers, and have provided a formal foundation of the
semantics of the language to make possible its use for
verification of design. Franke [1991], and Bradshaw and
Young [1991] also represent the intended function in a
manner similar to Functional Representation. An important
characteristic that distinguishes CFRL from the work of
Franke, Bradshaw, and Young is the role causal knowledge
plays in CFRL and in its use in verification. Our work is
based on the conviction that causal relations are an essential
part of functional knowledge, and that explicit
discussed by Chandrasekaran et al. [1993]. As such, the
system we have described concentrates mostly on acquiring
information about the structural and behavioral aspects of
the device to be designed, but does not handle the full range
of possible types of design rationale. There has been much
work on development of systems for design rationale
capture [Klein, 1993; Fischer et al., 1991]. In contrast to our
work, most of them concentrate on capturing the second
type of rationale in the form of argumentation structure
(pros and cons) for individual features of the design. Few
offer a representation formalism expressive enough to
capture detailed functional knowledge of how the design is
intended to achieve its goal, nor do they treat the
development of the functional specification as a distinct
activity in the design process. We believe that a
comprehensive design support system must be able to
represent and reason about both types of information, and
we intend to extend our system to handle the full range of
information involved in device design.
1 Where I Agree

I have followed, from a distance born of partly convergent and partly divergent goals, the
research that has gone on in the name of “Qualitative Physics” (QP). The term QP is
normally taken to mean reasoning about the physical world. A good deal of this work,
however, has concentrated on prediction of behavior of physical configurations for which
equational models from Physics can be written down in a tractable way (in contrast to,
say, complex biological systems). Another part of this work has gone on in the domain of
modeling functions and malfunctions of devices, and causal processes that participate in
them. There has also been a body of work that has been called “naive physics,” an attempt
at modeling the commonsense knowledge of the physical world.

Sacks and Doyle’s paper (hereafter S&D) is very useful in bringing to light the
difficulties within one type of QP research, namely, prediction of behavior of certain types of
physical systems. While I find the technical points made by S&D to be instructive
overall, I think that the vision that they outline for a “new” QP needs to be significantly
broadened.

Let me first review what I take to be the main technical point of the paper. For the
behavior prediction task, the physical system is modeled as a vector of state variables of
interest. Modeling includes specification, for each variable, of how a change in that variable
affects other variables, specifically what changes in other variables follow. (For the kinds of
systems considered, these relations are acausal, but if one part of the relation is taken to
be the cause, the other is the effect. For example, in Newton’s Law, F = m a, if the force
is taken as the cause, the acceleration can be viewed as the effect, and vice versa.) The
physical system is characterized by this (causal) model, which I shall call M. The behavior
prediction task is to calculate or infer the values of all the dependent variables as changes
in some of the independent variables are initiated.

When the variables are real numbers and the relationship between the changes in values
of the variables is known completely and is given as a differential, the situation is fully
characterized by a set of simultaneous differential equations. Classical Physics teaches us how
to set up such differentials for many physical systems. The theory of differential equations
provides the calculus by which behavior can be predicted. This is all well established as
part of applied physics and mathematics.

However, the version of the prediction problem that QP theory attempts to solve is one
where the input is known only qualitatively, or where the physical system itself is only
qualitatively characterized. Under these conditions, we need new techniques by which behavior
can be predicted, and several investigators have proposed such techniques (which S&D
label SPQR techniques). As S&D show, while there are differences between the proposed
techniques, all of them represent the relations between variables in some qualitative form
(signs of changes around a “normal” value or monotonic relations). (That is, M, which we can
now call the qualitative model, contains only qualitative relations between state variables.)
There are corresponding proposals for calculi for prediction of behavior. These calculi share
a style of inference in which the changes in the values of independent variables are propagated
using the relations in the qualitative model until, to the extent the underlying ambiguities
allow, the effects on all the variables are obtained.
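
The propagation style just described can be illustrated with a minimal sketch. It is not any of the published SPQR calculi: the sign algebra, the function names, and the small heating example are all assumptions introduced here for illustration. The point it shows is how opposing influences immediately produce an ambiguous result.

```python
# Qualitative values: '+', '-', '0', and '?' (direction cannot be determined).

def q_mul(a, b):
    """Qualitative product of two signs."""
    if a == '0' or b == '0':
        return '0'
    if a == '?' or b == '?':
        return '?'
    return '+' if a == b else '-'


def q_add(a, b):
    """Qualitative sum; opposing or unknown contributions yield '?'."""
    if a == '0':
        return b
    if b == '0':
        return a
    return a if a == b else '?'


def propagate(influences, perturbation, order):
    """Propagate sign changes through monotonic influences.

    influences:   list of (source, target, sign) triples.
    perturbation: dict of changes initiated on independent variables.
    order:        dependent variables, listed causes before effects.
    """
    change = dict(perturbation)
    for var in order:
        total = '0'
        for source, target, sign in influences:
            if target == var:
                total = q_add(total, q_mul(sign, change.get(source, '0')))
        change[var] = total
    return change


# Hypothetical model: heating raises temperature, temperature raises
# pressure, venting lowers pressure.
INFLUENCES = [
    ('heat', 'temperature', '+'),
    ('temperature', 'pressure', '+'),
    ('venting', 'pressure', '-'),
]
ORDER = ['temperature', 'pressure']

print(propagate(INFLUENCES, {'heat': '+'}, ORDER))
print(propagate(INFLUENCES, {'heat': '+', 'venting': '+'}, ORDER))
```

With heating alone the pressure change is '+'; with heating and venting together the two influences oppose each other and the pressure change is '?', which is the kind of ambiguity the text refers to.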

S&D show that the calculi that have been offered in QP research do not perform
particularly well. They claim that if one has a qualitative equation, then qualitative analysis
using knowledge of advanced dynamical analysis can produce better results.

2  Why  Do  the  QP  Calculi  Have  Trouble? 

What characteristic of the QP methods gives them this disadvantage? S&D blame the
qualitative language itself, arguing that it misses relevant distinctions. Of course, missing some
distinctions comes with the territory for qualitative languages. But there is another reason
for the difficulties faced by the QP calculi: the sequence of inferences mirrors the sequence
of local causal interactions of the variables as described by the relations in the qualitative
model. That is, prediction is performed by literally “simulating” the system using the
relations in the qualitative model. Sacks’s dynamical system analysis is not restricted by this
property; it uses “global” techniques which directly generate qualitative analyses of the
solution space. These techniques in some sense use much more knowledge of the mathematics of
dynamical systems than is available in the qualitative model or the QP calculi. (A similar
situation pertains to the causal ordering analysis of Iwasaki and Simon [Iwasaki and Simon, 19M]
versus that by de Kleer and Brown. The latter arrives at the causal ordering by following a
series of local causal transitions, while the former is a global analysis of dependencies.)
Others have recognized the importance of this property of the QP calculi: in
[Forbus, 19M, Kuipers, 19M] similar observations about the distinction between information flow
in the analysis and the flow of causality in the system are made.

There is also an additional problem in the QP calculi: the qualitative model only contains
information about the specific physical system, the relations between variables are at one level
of abstraction, and the calculi are relatively impoverished in inferential power. All human
reasoning takes place with the benefit of substantial background knowledge about other
abstractions and generalization rules. For example, a human might be able to use the knowledge
that reaching the same state again and again means that the system is in a cyclic state, and
may make the inferential jump, even without advanced mathematical knowledge about
dynamical systems.
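
As a small illustration of this kind of generalization rule, the sketch below scans a trace of qualitative states and reports the first state that recurs. The function name, the encoding of a state as a tuple of signs, and the oscillator trace are assumptions made for the example. A recurrence is treated only as grounds for hypothesizing a cycle, not as a proof of periodic behavior.

```python
def first_repeat(trace):
    """Return (first_index, repeat_index) for the first revisited state.

    trace is a sequence of hashable qualitative states. Returns None if
    no state occurs twice.
    """
    seen = {}
    for index, state in enumerate(trace):
        if state in seen:
            return seen[state], index
        seen[state] = index
    return None


# Hypothetical trace for a mass on a spring: each state is
# (sign of displacement, sign of velocity).
TRACE = [
    ('+', '0'), ('+', '-'), ('0', '-'), ('-', '-'),
    ('-', '0'), ('-', '+'), ('0', '+'), ('+', '+'),
    ('+', '0'),
]

repeat = first_repeat(TRACE)
if repeat is not None:
    start, again = repeat
    print(f"state {TRACE[start]} recurs after {again - start} steps: "
          "hypothesize cyclic behavior")
```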

The QP calculi seem to have been designed under some implicit constraints, namely,
that they display some of the perceived properties of human reasoning about the physical
world: that humans often appear to combine causal relations recursively, and in cases where
they have the structure of the physical system available, trace the topology of the physical
system to follow the “flow of causality.” I will argue that QP techniques should aim to use
the heuristic power of human reasoning even more, while applying the power of formal
analysis to clearly defined subproblems where such techniques are needed. Thus the issue is
broadened to include: What should the connection of QP research be to human commonsense
knowledge and reasoning about the physical world? Is Newtonian physical modeling
sufficient for QP, or necessary for all the goals of QP? If one were only interested in
producing a technology that assists in reasoning about the physical world, can one develop this
technology without to some degree being concerned with human commonsense knowledge
and reasoning methods? My concern is to ensure that qualitative physics research has a
significant place not only for mathematically sophisticated analysis techniques as S&D
propose, but also for a whole spectrum of issues concerning the sources of the power in human
reasoning about the physical world.

3  Human  qualitative  reasoning  about  the  physical  world 

A trained physicist and an unschooled man-on-the-street start with a common ontology
and a shared cognitive architecture. The physicist learns, and may add to, a specialized
ontology as well, and acquires a number of modeling and analytical techniques. We need
to sort out these distinct types of knowledge about the physical world that come into play
in human reasoning.

1. A commonsense ontology which predates and is in fact used by modern science: space,
time, flow, physical objects, cause, state, perceptual primitives such as shapes, and so on.
The commonsense ontology also comes with some terms that are given specific technical
meanings by science, but in general the terms in this ontology are experientially and logically
so fundamental that scientific theories are built on the infrastructure of this ontology. Early
work in QP had as a main goal elaboration of such an ontology ([Hayes, 1979, Forbus, 1984]
are examples). Even today, a good deal of QP research grapples with the development of
ontologies for different parts of commonsense physical knowledge.

2. The scientific ontology is built on the commonsense ontology (and often gives specific
technical meanings to some of the terms in it, such as “force”). Additional concepts and
terms are constructed. Some of these are quite outside commonsense experience (examples
are “voltage,” “current,” and “charm of quarks”).

3. Compiled causal knowledge. People compile causal expectations partly from direct
experience and partly by caching some results from earlier problem solving. Which causal
expectations get stored and used is largely determined by the relevance of the causes and
effects to the goals of the problem solver. There is a more organized form of causal knowledge
that we build up as well: models of causal processes. By process model I mean a description
in terms of temporally evolving state transitions, where the state descriptions are couched
using the commonsense and scientific ontologies. For example, we have commonsense causal
processes such as “boiling,” or specialized ones such as “voltage amplification,” “the
business cycle,” and so on. These are not neutral, agent-independent process descriptions,
but ones in which the qualitative states that participate in the description have been
chosen based on abstractions of interest to the agent. In particular, such descriptions are
couched in terms of possible intervention options on the world to affect the causal process,
or observations to detect the process. Forbus’ processes [Forbus, 1984] and my and my
colleagues’ work on functional representations [Sembugamoorthy and Chandrasekaran, 1988,
Keuneke, 1991, Goel, 1989, Sticklen and Tufankji, 1991] are examples concerned with the
development of representations for causal processes.

When the process model is based on pre-scientific or unscientific views, we have naive
process models (such as models of the sun rotating around the earth, or of exorcism of evil
spirits). Many pre-scientific process models are not only quite adequate, but are actually
simpler and more computationally efficient than the scientific ones, for everyday
purposes.

These process descriptions are great organizing aids: they focus the direction of
prediction, help in the identification of structures to realize desired functions in design
[Goel, 1989], and suggest actions to enable or abort the process.

4. Mathematical equations embodying scientific laws and expressing relations between
state variables. These equations themselves are acausal, and any causal direction is given
by additional knowledge about which variables are exogenous.
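
The last point, that causal direction comes from knowing which variables are exogenous, can be sketched in code. The procedure below is a deliberately simplified assumption made for illustration: it only handles equations that can be resolved one unknown at a time, and it is not the causal ordering algorithm of Iwasaki and Simon, which also treats sets of simultaneous equations. The function name and the equation labels are hypothetical.

```python
def causal_ordering(equations, exogenous):
    """Derive a causal direction for acausal equations.

    equations: dict mapping an equation label to the set of variables
               it relates (the equation itself is direction-free).
    exogenous: set of variables taken as externally determined.

    Returns (ordering, unresolved), where ordering is a list of
    (effect, causes) pairs and unresolved holds equations that could
    not be resolved one unknown at a time.
    """
    determined = set(exogenous)
    remaining = dict(equations)
    ordering = []
    progress = True
    while remaining and progress:
        progress = False
        for label, variables in list(remaining.items()):
            unknown = variables - determined
            if len(unknown) == 1:
                (effect,) = unknown
                ordering.append((effect, sorted(variables - {effect})))
                determined.add(effect)
                del remaining[label]
                progress = True
            elif not unknown:
                del remaining[label]  # adds no new dependency
    return ordering, remaining


# Newton's law relates F, m, and a without any built-in direction.
NEWTON = {'newton': {'F', 'm', 'a'}}

print(causal_ordering(NEWTON, {'F', 'm'}))  # a is the effect of F and m
print(causal_ordering(NEWTON, {'a', 'm'}))  # F is the effect of a and m
```

The same equation yields opposite causal readings depending only on the choice of exogenous variables, which is the sense in which the equations themselves are acausal.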

4  Some  of  the  Things  That  A  New  QP  Should  Include 

It is generally agreed, including by S&D, that a QP theory or framework should provide
support for three components of reasoning about the physical world: modeling, prediction
and control. In fact, a weakness of their paper is that they pay only lip service to the
problem of modeling and fail to show why or how dynamic analysis will help solve that and
the control problems. With the recent exception of SPQR calculi, quite a bit of the work
in QP research is concerned with the development of ontologies, which are directly relevant
to the modeling problem. Since I expect other respondents to outline precisely how the
QP field is paying attention to these problems, I will concentrate on those aspects of the
problem unlikely to be emphasized by them.

4.1 Modeling

All modeling is done in the context of goals to be accomplished, i.e., states to be achieved
or avoided in the world. The heart of the modeling problem is to map from goals to
tractable representations. Compiled causal knowledge (see discussion in Section 3) plays
an essential role in identifying aspects of the physical situation and perspectives that need
to be represented. The causal process models can be used to identify states that should
be represented and reasoned about. Given a physical situation and goals, causal processes
whose results are relevant to the achievement of the goals are retrieved and used as a
guide in modeling the situation. The aggregation levels (when dealing with populations)
[Weld, 1986], the abstractions, the approximations and the concepts in the representation
are all jointly determined by the physical situation, the goals, and the rich storehouse
of causal process knowledge that expert reasoners possess. Progress in modeling requires
progress in ontology development and causal process descriptions.

4.2  Prediction 

The power of experts in prediction comes, not from wholesale formalization of the problem
in terms of Physics and subsequent qualitative or other simulation (as much of current QP
work tends to present the problem), but from the use of a substantial body of compiled causal
knowledge in the form of causal process descriptions to hypothesize states of potential
interest. Further, the state variables participating in causal relations may not all be continuous,
and hence, even in principle, not all problems of prediction can be formulated as analysis
of dynamical systems. For example, a substantial part of our causal knowledge is about
nominal variables (“vacations relax people,” “lack of support causes objects to fall”). Simon
[Simon, 1991] describes a causal ordering scheme which works with such variables, but, as
a rule, the SPQR models and the dynamic system analysis techniques work only with state
variables which are continuous.

Humans, in their everyday life, rarely predict behavior in the physical world by
generating a long series of causal chains, certainly not a series of inferences that can be called
“sound.” The reasons for this are brought out clearly by QP work: ambiguities proliferate
rapidly. If you ask someone what will happen if a ball is thrown at a wall, very little of the
sequence of predictions is the result of application of scientific laws of motion. Rather, a
short series of causal sequences is constructed from compiled causal knowledge, instantiated
to the specific physical situation. Two important sources of power that are available
to human experts in generating successor states and handling ambiguities are discussed
next.

Compilation of consequences

If we ask someone, “what will happen if I throw a rock at the glass window?” that person
is likely to say, “the window might break.” A number of such causal fragments, compiled
from experience or from earlier problem solving episodes, are stored as part of our causal
knowledge about domains of interest. The ambiguity (“might”) is acceptable, since the goal of
qualitative prediction is typically not accuracy or certainty, but identification of an
interesting possibility that may be investigated more thoroughly if needed.

Handling  ambiguity 

Ambiguities in causal simulation are often handled not on the basis of what effect will
happen, but on the basis of what might happen that may help or hurt the explicit or
background goals of the problem solver. Thus, when there is more than one successor state
in simulation, the state that is related to goals of interest is chosen. In the example involving
the glass window, suppose a person was standing on the other side of the glass window, and
you saw someone about to throw a rock at the window. You would most likely attempt
either to stop the rock throwing or alert the person standing at the window. You would not
be paralyzed by the ambiguities in prediction: the rock may not really hit the window,
the window may not shatter, the rock may miss the person, the rock or glass fragments may
not draw blood, and so on. Not only the commonsense world, but engineering reasoning is
also full of such goal-driven ambiguity handling. For example, in design analysis, one might
use this form of ambiguity handling to identify the possibility that a component will make
its way into a dangerous state. Of course, once this possibility is identified, quantitative
or other normative methods can be used in a selective way to verify the hypothesis. This
heuristic role of qualitative behavior prediction is often ignored in QP research in favor of
concerns about completeness and soundness of predictions.
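
A minimal sketch of goal-driven ambiguity handling, using the rock-and-window example, follows. The representation of compiled causal fragments as a table of possible successor states, the function name, and the depth bound are all assumptions made for illustration. Rather than resolving which successor will occur, the search follows every branch a few steps and reports only those states that bear on the reasoner's goals.

```python
def reachable_concerns(fragments, start, goals, max_depth=3):
    """Collect goal-relevant states reachable from start.

    fragments: dict mapping a state to its possible successor states
               (compiled causal fragments, each one a "might happen").
    goals:     dict mapping a state to 'avoid' or 'achieve'.
    Returns a list of (state, attitude) pairs in order of discovery.
    """
    frontier = [start]
    seen = {start}
    concerns = []
    for _ in range(max_depth):
        next_frontier = []
        for state in frontier:
            for successor in fragments.get(state, []):
                if successor in seen:
                    continue
                seen.add(successor)
                if successor in goals:
                    concerns.append((successor, goals[successor]))
                next_frontier.append(successor)
        frontier = next_frontier
    return concerns


# Hypothetical compiled fragments for the example in the text.
FRAGMENTS = {
    'rock thrown at window': ['rock misses window', 'rock hits window'],
    'rock hits window': ['window intact', 'window shatters'],
    'window shatters': ['bystander unhurt', 'bystander injured'],
}
GOALS = {'bystander injured': 'avoid'}

print(reachable_concerns(FRAGMENTS, 'rock thrown at window', GOALS))
```

Every branch in the table is ambiguous, yet the one possibility that matters to the goals is identified after three short steps, and it can then be handed to more careful methods if verification is needed.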

There is often no reason to make a complete and ambiguity-laden prediction of a physical
situation, if small and possibly retractable changes can be made to the physical world, and
changes directly noted. In fact, this is just another instance of using the real world as a
computational aid [Chapman, 1990], and avoiding long chains of reasoning based on complex
symbolic reasoning models.

In engineering and scientific reasoning, whenever reasoning about consequences reaches
a point where relatively precise answers are needed for choices to be made, and only then,
the situation can be selectively modeled and analytical methods of varying degrees of
complexity and precision can be employed. The models that are formed reflect the problem
solving goal that is current, and typically represent only a small slice of the physical system.

4.3  Control 

For control of reasoning (or the selection and deployment of reasoning methods, the
so-called problem of “intelligent control of methods”), we will need a theory of task analysis
for various tasks and domains. Such a task analysis will delineate alternative methods for
a problem, and the subtasks that each method generates. The properties and knowledge
requirements for each method could also be identified as part of such a task analysis, so
that choices of methods are made with due respect to the needs of goals and knowledge
availability. For example, a decision might be made about whether the current goal can be
met by using stored causal process descriptions, by obtaining more accurate information
from the real world, or by performing calculations. An engineer might be able to use
order-of-magnitude reasoning in some cases to decide that safe levels of current will obtain in a
circuit. In other cases, predictive reasoning (with some type of ambiguity resolution) might
simply identify unsafe current levels as a possibility. In the latter case, a detailed formulation
of Kirchhoff’s laws for the relevant subsystem can be made and the equations solved. Thus,
an item in the research agenda for any QP of the future will need to be the development of
task structures for various tasks involved in QP. An example of such a task analysis is the
task structure for design given in [Chandrasekaran, 1990].
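
To make the idea of method selection concrete, here is a small sketch under stated assumptions. The `Method` record, its fields, the numeric precision and cost scales, and the three example methods are hypothetical and are not taken from any published task structure. The sketch shows only the general shape of the decision: pick the cheapest method whose knowledge requirements are met and whose precision suffices for the current goal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    name: str
    requires: frozenset  # kinds of knowledge the method needs
    precision: int       # 1 = flags a possibility, 3 = numeric answer
    cost: int            # relative effort of applying the method


def choose_method(methods, available, needed_precision):
    """Return the cheapest applicable method, or None if there is none."""
    candidates = [
        m for m in methods
        if m.requires <= available and m.precision >= needed_precision
    ]
    return min(candidates, key=lambda m: m.cost, default=None)


# Hypothetical methods for judging whether circuit current is safe.
METHODS = [
    Method('stored causal process descriptions',
           frozenset({'causal processes'}), precision=1, cost=1),
    Method('order-of-magnitude reasoning',
           frozenset({'rough parameter values'}), precision=2, cost=2),
    Method("solve Kirchhoff's equations",
           frozenset({'circuit equations', 'parameter values'}),
           precision=3, cost=5),
]

everything = {'causal processes', 'rough parameter values',
              'circuit equations', 'parameter values'}

print(choose_method(METHODS, everything, needed_precision=1).name)
print(choose_method(METHODS, everything, needed_precision=3).name)
```

A fuller task analysis would also record the subtasks each method generates; that is omitted here to keep the example short.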


5  Summary 

The QP community represents many diverse goals. Even if one were not especially interested
in cognitive modeling and only cared about producing powerful technologies to help in
reasoning about physical devices, QP can exploit representations and techniques that form
part of human expertise.

The thrust of my discussion has been that if we are going to be doing a new QP, we
might as well expand its scope to include these different sources of power in expert reasoning.
While I agree with the points raised by S&D regarding the deficiencies of the current QP
calculi and also with their argument for the importance and power of qualitative analysis
of dynamical systems using advanced mathematical knowledge, the general problem of QP
is larger than that of reasoning about systems for which Physics models can be readily
made and which can be cast as dynamical systems of certain types. Many QP researchers
have recognized this and have been hard at work on developing the ontologies and process
descriptions needed for the modeling task. However, much QP research in modeling and
reasoning is too concerned with step-by-step soundness at the expense of the heuristic
power of much human reasoning. It misses the role played by the “situatedness” of human
reasoning about the physical world: the background goals and the rich store of causal
knowledge already possessed. It is not that one wants the results to be incorrect, but
soundness of human causal reasoning is the result of interesting and effective hypotheses
made in the first place, followed, if necessary, by a focused application of verification or
refinement techniques.

Finally, we need to expand our consideration of media for representation and
manipulation of concepts; for example, reasoning with pictorial representations for problems
involving shapes seems to be a natural direction of research.
We propose as a working hypothesis a Separability
Hypothesis which posits that one can factor off an
architecture for cognition from a more general
architecture for mind, thus avoiding a number of philosophical
objections that have been raised about the ‘strong AI’
hypothesis. Using a coin-sorting machine as an example,
we discuss a range of positions on representations and
argue that, for many purposes, the same body of matter
can be interpreted as bearing different representational
formalisms. We then propose that one way to understand
the diversity of architectural theories is to make a
distinction between deliberative and subdeliberative
architectures. The search for one architectural level
which will explain all the interesting phenomena of
cognition is likely to be futile. There are a number of
levels that interact, and this interaction makes
explanation in terms of one level quite incomplete.


Dimensions for thinking about thinking

A major problem in the study of intelligence and
cognition is the range of (often implicit) assumptions
about what phenomena these terms are meant to
cover. Are we just talking about cognition as having
and using knowledge, or are we also talking about
other mental states such as emotions and subjective
awareness? Are we talking about intelligence as an
abstract set of capacities, or as a set of biological
phenomena? These two questions set up two dimensions
of discussion about intelligence. After we discuss these
dimensions we will discuss information processing,
representation, and cognitive architectures.

Dimension 1. Is intelligence separable from other
mental phenomena?

When people think of intelligence and cognition, they
often think of an agent being in some knowledge state,
that is, having thoughts and beliefs. They also think of the
underlying process of cognition as something that
changes knowledge states. Since knowledge states are
particular types of information states, the underlying
process is thought of as information processing. (We
will discuss this in more detail later in the paper.)
However, besides the knowledge states, mental
phenomena also include such things as emotional states
and subjective consciousness. Under what conditions
can these other mental properties also be attributed to
artifacts to which we attribute knowledge states? Is
intelligence separable from these other mental
phenomena?

It is possible that intelligence can be explained or
simulated without necessarily explaining or simulating
other aspects of mind. A somewhat formal way of
putting this Separability Hypothesis is that the
knowledge state transformation account can be factored off
as a homomorphism of the mental process account.
That is: if the mental process can be seen as a sequence
of transformations $M_1 \to M_2 \to \cdots$, where $M_i$ is the
complete mental state and F is the transformation function
(the function that is responsible for state changes),
then a subprocess $K_1 \to K_2 \to \cdots$ can be identified such
that each $K_i$ is a knowledge state and a component of
the corresponding $M_i$, the transformation function is f,
and f is some kind of homomorphism of F. A study of
intelligence alone can restrict itself to a characterization
of K's and f, without producing accounts of M's and F.
If cognition is in fact separable in this sense, we can in
principle design machines that implement f and whose
states are interpretable as K's. We can call such
machines cognitive agents, and attribute intelligence to
them if they achieve goals. However, the states of such
machines are not necessarily interpretable as complete
M's, and thus they may be denied other attributes of
mental states.
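
One way to make the phrase “some kind of homomorphism” concrete, offered only as a sketch, is to assume a projection $\pi$ (a symbol introduced here, not in the original) that maps each complete mental state to its knowledge component. Writing f for the knowledge-level transformation function, the hypothesis then amounts to the requirement that projecting and transforming commute:

```latex
\[
  K_i = \pi(M_i), \qquad M_{i+1} = F(M_i), \qquad
  \pi\bigl(F(M_i)\bigr) = f\bigl(\pi(M_i)\bigr) \quad \text{for all } i,
\]
so that $K_{i+1} = f(K_i)$.
```

Under this reading, f is well defined only if two mental states with the same knowledge component always lead to successors with the same knowledge component. Denying separability then amounts to denying that such an f exists.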

For example, Searle¹ holds that a computer
program that successfully translates from Chinese to
English cannot be said to ‘understand Chinese’, even
though it is behaviorally intelligent in this task. In our
terminology, we would attribute to the program various
appropriate knowledge states. Searle’s objection can be
formulated as the claim that ‘understanding’ is a
subjective property that goes beyond merely being in
the corresponding knowledge state, and thus the
program can be denied that attribute.

However,  other  researchers  claim  that  intelligence 
cannot  be  separated  from  other  mental  phenomena. 
Such  a  claim  is  often  made  from  two  opposite 
perspectives.  Most  people  in  artificial  intelligence  (AI) 
and  cognitive  science  say  that  intelligence  and  other 
aspects  of  mind  are  inseparable  because  the  other 
mental  aspects  (subjectivity,  emotional  states,  etc)  are 
simply  ‘emergent’  properties  of  certain  kinds  of  complex 
agents  with  knowledge  states.  If  this  is  the  case,  the 
knowledge  state  account,  and  with  it  an  account  in 
terms  of  information  processing,  will  be  a  sufficient 
basis  for  explaining  and  building  minds.  From  this 
perspective, explanation of the phenomena of intelligence
and cognition will also turn out to be explanation
of the full range of mental phenomena. By the
same  token,  it  is  assumed  that  artificial  agents  that  can 
be  plausibly  interpreted  as  solving  problems,  achieving 
goals,  and  performing  reasoning  will  also  have 
emotional states and subjective consciousness attributable
to them.

The  second  perspective  from  which  intelligence  is 
taken  to  be  inseparable  from  other  mental  phenomena 
holds  that  there  is  no  coherent  way  to  factor  off  a 
knowledge  state  process  account  from  a  mental  state 
process  account.  There  is  only  one  mental  process.  That 
is,  from  this  point  of  view,  the  categorical  difference 
between  different  attributes  of  mental  states  is  affirmed, 
but  the  Separability  Hypothesis  is  denied.  We  can  talk 
about knowledge components of mental states, but
mental  processes  have  no  ‘subprocesses’  which  only 
have  to  do  with  information  processing.  In  this  view, 
the  only  way  to  explain  or  build  an  intelligence  is  to 
solve  the  problem  of  explaining  or  building  a  mind. 
Thus  only  agents  which  have  the  totality  of  what  we 
call  ‘mind’  will  be  able  to  perform  as  truly  successful 
problem  solvers  across  the  whole  range  of  situations 
deemed  to  require  intelligence. 

Edelman has argued that information processing is
not the appropriate way to talk about cognition.
Instead he proposes that the basic mechanisms of the
brain are the selection of successful neural pathways in
response to interactions with the world. The processes
that underlie this neuronal path selection resemble
Darwinian evolutionary processes. Cognitive phenomena,
in his view, cannot be separated and understood
in information processing terms, since cognitive states
are simply aspects of more general brain states, and the
basic brain mechanisms are not information processes.

Dimension 2: Functional versus biological

The  second  dimension  in  discussions  about  intelligence 
involves  the  extent  to  which  we  need  to  be  tied  to 
biology for understanding intelligence. Can intelligence
be characterized abstractly as a functional capability
which  just  happens  to  be  realized  more  or  less  well  by 
some  biological  organisms?  If  it  can,  then  study  of 
biological  brains  or  of  human  psychology  is  not 
logically  necessary  for  a  theory  of  cognition  and 
intelligence,  just  as  enquiries  into  the  relevant  capabilities 
of  biological  organisms  are  not  needed  for  the  abstract 
study  of  logic  and  arithmetic  or  for  the  theory  of  flight. 
Of  course,  we  may  learn  something  from  biology  about 
how  to  practically  implement  intelligent  systems,  but  we 
may  feel  quite  free  to  substitute  non-biological  (both  in 
the  sense  of  architectures  which  are  not  brain-like  and 
in  the  sense  of  not  being  constrained  by  considerations  of 
human  psychology)  approaches  for  all  or  part  of  our 
implementation. Whether intelligence can be characterized
abstractly as a functional capability surely
depends upon what phenomena we want to include in
defining the functional capability, as we discussed. We
might have different constraints on a definition that
needed to include emotion and subjective states from
one that only included knowledge states. Clearly, the
enterprise  of  AI  deeply  depends  upon  this  functional 
view  being  true  at  some  level,  but  whether  that  level  is 
abstract  logical  representations  as  in  some  branches  of 
AI,  Darwinian  neural  pathway  selections  as  proposed 
by  Edelman,  something  intermediate,  or  something 
physicalist is still an open question.

Newell  holds  a  functional  view  of  intelligence. 
According to Newell, intelligent agents can be
abstractly characterized by their goals, their knowledge
and the Principle of Rationality. That is, when we
attribute intelligence to an agent in some behavior, we
are attributing to that agent a goal, a body of knowledge,
and a capability, at least in that instance of
behavior, of applying knowledge relevant to the goal to
decide what to do. It is important to note that all of
this is attribution. Newell calls a description of an agent
in these terms a Knowledge Level description. Knowledge
Level descriptions view the agent as being in a
knowledge state, and the Principle of Rationality as the
abstract characterization of how the agent changes
knowledge states. (Attributing knowledge and goals to
an agent is similar to taking the intentional stance
towards agents that Dennett has proposed.)

There  is  no  claim  that  knowledge  is  internally 
represented  explicitly,  and  in  just  the  same  propositional 
units as in the attribution, or that explicitly inferential
processes are operating. Newell defines the functionality
of intelligence as the ability of an agent to realize the
knowledge potential inherent in its Knowledge Level
description. For Newell the important character of
intelligence is the agent’s ability to make full use of the
knowledge attributed to it, not the amount or the
specifics of the agent’s knowledge. Even humans are
only an approximation to the ideal intelligence so
defined. In this perspective, biological evolution will be
seen as operating in the direction of better and better
approximation to this sort of intelligence through the
evolution of more complex knowledge state representations
(of the sort that finds its culmination in human
language)  which  are  capable  of  supporting  open-ended 
deliberation  and  the  application  of  knowledge  to  new 
goals. 

So, with Newell, we have a functional characterization
of intelligence which is independent of biology. But
Newell  goes  on  to  propose  an  architecture  which  is 
inspired  by  one  biological  instantiation,  the  human 
cognitive apparatus. This architecture is similar to the
human  one  in  that  it  has  a  long-term  memory  and  a 
deliberative  architecture  similar  to  the  one  that  in  his 
view  characterizes  human  cognition.  But,  because  it  is  a 
functional  architecture,  it  goes  beyond  the  biological  in 
many  ways.  For  example,  the  ideal  architecture  always 
retrieves  the  relevant  knowledge,  unlike  the  human 
version  which  often  fails  to  remember.  Further,  the 
functional  architecture  is  based  on  digital  computer-like 
symbol  structures.  For  Newell,  it  does  not  matter  if  the 
human  brain  is  literally  such  a  computer.  All  that 
matters is that this kind of computer-like symbol
structure can support the functionality needed. Further,
the architecture that is proposed by Newell as a
possible one for AI is just one among many possible
realizations of the abstract functional capability specified
in his definition of intelligence.

In general, functional characterizations end up using
aspects from very different levels of descriptions of
biological mind. For example, the connectionists want
to  be  biological  enough  to  include  some  of  the  smooth 
concept  learning  done  by  humans,  and  an  architecture 
based  on  some  abstract  properties  of  what  they  take  to 
be  the  information  processing  of  brains,  but  their 
orientation  is  not  so  biological  as  to  demand  wet 
neurons  and  neuronal  chemistry.  Searle  wants  to  be 
biological  enough  to  demand  the  inclusion  of  the 
subjective  awareness  of  being  in  a  knowledge  state 
(which  is  how  we  interpret  his  claim  that  a  translator 
who  follows  the  algorithm  does  not  really  'understand 
Chinese')  that  humans  have,  but  he  thinks  that  it  is 
most  likely  the  chemistry  of  the  brain  that  is 
responsible  for  it,  and  thus  a  pure  information 
processing  account  will  not  succeed.  Edelman  wants  to 
be  biological  enough  to  include  the  way  in  which 
organisms'  brains,  in  his  view,  do  not  use  pre-made 
internal  labels  (which  he  takes  to  be  the  characteristic 
property  of  information  processing).  Since  his  theory  of 
pathway  selection  itself  is  stated  as  an  abstract 
mechanism, presumably artifacts could be constructed
which implement that abstract architecture without any
further reference to biology. Connectionists (Rumelhart
et al.) and Edelman want to be biological enough to
understand the common heritage between animals and
humans,  while  traditional  AI  researchers  stop  their 
biological  commitment  to  characterizing  intelligence  as 
using  knowledge  to  reason  and  achieve  goals  (since  they 
take  humans  to  be  doing  that).  Thus,  all  such  proposals 
pick  out  some  interesting  aspect  from  biological 
phenomena.  They  then  proceed  to  formulate  a 
functional model that includes the selected aspect. After
this, real biology is no longer logically necessary.
Whether any of these proposals would lead to the production
(or explanation) of mentality in total, or, almost
circularly, produce only those aspects of mentality that
are included in the functional definition, is obviously an
open  question. 

Coin-sorters and knowledge states

In  this  article  we  will  take  the  Separability  Hypothesis 
as a working hypothesis. At this point, for all practical
purposes AI (and cognitive science) can be considered
the study of those regularities of mind that have
information-processing explanations. We will assume
that  it  is  a  worthwhile  enterprise  to  concentrate  on 
phenomena  in  which  knowledge  states  of  the  agent 
seem  to  play  the  central  role.  Further  we  will  focus  on 
processes  that  account  only  for  generation  and 
transformation  of  such  knowledge  states.  Now  this 
might  appear  to  be  a  commitment  to  information 
processing  so  strong  that  many  interesting  theories  will 
be ruled out. However, we will argue that the
knowledge  state  account  is  very  flexible,  and  can  even 
be  applied  to  situations  where  there  is  no  explicit 
information  processing  in  the  conventional  sense.  To 
illustrate this we will use the example of a coin-sorter
for US coins.

Analysis  of  a  coin-sorter 

Let  us  suppose  that  we  have  a  black  box  coin  sorter  in 
front  of  us,  and  we  want  to  describe  its  behavior 
computationally.  All  we  see  is  that  the  coins  are  put 
into  the  top  of  the  coin  sorter,  and  then  they  come  out 
through  one  of  four  slots  at  the  bottom,  with  all  the 
dimes  coming  out  of  the  slot  designated  the  dime  slot, 
and the quarters coming out of the slot designated the
quarter slot, and so on. Let us assume we have four
types of AI theorists: a logician, someone who is
committed to algorithms alone as the language in
which to formulate AI theories, a connectionist and a
physicalist, i.e. one who claims that the appropriate
explanation of the coin-sorter should be in terms of its
physics, not representations.

Logic  system  coin  sorter.  The  logician  proposes  that 
the  machine's  behavior  can  be  understood  in  terms  of 
four logical axioms, one for each coin. A set of
measurements is made on each of the coins.
Perhaps diameter, weight and thickness are the coin’s
important features for this purpose. Each coin type is
characterized by a logical formula of predicates
involving the measurements. For example, the axioms
for each of the four types of coins will indicate what
combination of weight, thickness and diameter characterizes
that coin type. The logician claims that the
behavior  of  the  machine  can  then  be  characterized  by  a 
theorem-proving  decision  procedure  that  attempts  to 
prove  each  of  the  theorems  for  each  coin  that  is 
inserted,  followed  by  a  mechanism  that  places  the  coin 
into  that  slot  corresponding  to  the  theorem  that  was 
proved. 
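
The logician's proposal can be sketched in a few lines. This is a minimal illustration, not a theorem prover: each "axiom" is reduced to a conjunction of tolerance predicates, and "proving the theorem" is reduced to evaluating that conjunction. The names `Coin`, `near`, `axiom_for` and `sort_by_axioms`, the tolerances, and the "reject" outcome are assumptions introduced here; the nominal dimensions are approximate published figures for US coins and should be treated as illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coin:
    diameter_mm: float
    weight_g: float
    thickness_mm: float


# Approximate nominal measurements per coin type (illustrative figures).
NOMINAL = {
    "penny": Coin(19.05, 2.50, 1.52),
    "nickel": Coin(21.21, 5.00, 1.95),
    "dime": Coin(17.91, 2.268, 1.35),
    "quarter": Coin(24.26, 5.67, 1.75),
}


def near(value: float, nominal: float, tolerance: float) -> bool:
    """Predicate: the measured value lies within tolerance of the nominal one."""
    return abs(value - nominal) <= tolerance


def axiom_for(nominal: Coin):
    """Build one 'axiom': a conjunction of predicates over the measurements."""
    def holds(c: Coin) -> bool:
        return (
            near(c.diameter_mm, nominal.diameter_mm, 0.3)
            and near(c.weight_g, nominal.weight_g, 0.2)
            and near(c.thickness_mm, nominal.thickness_mm, 0.15)
        )
    return holds


AXIOMS = {name: axiom_for(nominal) for name, nominal in NOMINAL.items()}


def sort_by_axioms(coin: Coin) -> str:
    """Try every axiom; send the coin to the slot of the single one that holds."""
    proved = [name for name, axiom in AXIOMS.items() if axiom(coin)]
    # Assumption: a coin satisfying no axiom (or several) is rejected.
    return proved[0] if len(proved) == 1 else "reject"
```

Note that the sketch fixes what information is used (the measurements and their combinations) but says nothing about the order in which the axioms are tried, which matches the point that an axiom system makes no commitment to control.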

Note  that  this  language  enables  us  to  argue  about 
different  theories  about  what  is  being  measured  by  the 
sorter.  Someone  could  watch  the  behavior  of  the  coin 
sorter and assert that the machine is not using information
about the weight and diameter of the coins, but
rather about, say, their color and metallic content. They
could propose an alternative axiom system in terms of
color and metallic content. Each such axiom system is a
different content theory expressed in the logic formalism.

Further,  the  formalism  can  be  used  to  evaluate  these 
alternate  theories  and  test  them  experimentally.  We  can 
use  logical  inference  to  draw  out  the  consequences  of 
each  proposal.  One  hypothesized  content  theory  might 
predict  that  a  given  foreign  coin,  say  an  Indian  rupee, 
will  come  out  of  the  quarter  slot,  while  another  might 
predict that the rupee will come out of the penny slot.
We can then test which hypothesized content
theory most accurately describes the decision-making
process within the black box by putting the rupee in
and seeing which slot it comes out of.
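
The experimental test described above can be sketched as follows. Two rival content theories are written as prediction functions, a probe coin is chosen on which they disagree, and the observed slot decides between them. Everything specific here is an assumption made for illustration: the two theories (sorting by diameter alone versus by weight alone), the function names, and the measurements of the probe coin, which are invented and are not the dimensions of any real rupee.

```python
# Approximate nominal (diameter in mm, weight in g) per US coin type.
NOMINAL = {
    "penny": (19.05, 2.50),
    "nickel": (21.21, 5.00),
    "dime": (17.91, 2.268),
    "quarter": (24.26, 5.67),
}


def predict_by_diameter(diameter_mm: float, weight_g: float) -> str:
    """Content theory A: the sorter uses only the diameter."""
    return min(NOMINAL, key=lambda name: abs(NOMINAL[name][0] - diameter_mm))


def predict_by_weight(diameter_mm: float, weight_g: float) -> str:
    """Content theory B: the sorter uses only the weight."""
    return min(NOMINAL, key=lambda name: abs(NOMINAL[name][1] - weight_g))


THEORIES = {"diameter": predict_by_diameter, "weight": predict_by_weight}


def surviving_theories(theories, probe, observed_slot):
    """Keep the theories whose prediction for the probe matches the observation."""
    return [name for name, predict in theories.items()
            if predict(*probe) == observed_slot]


# Hypothetical foreign coin: wide like a quarter, light like a penny.
PROBE = (25.0, 2.6)
```

For this probe, theory A predicts the quarter slot and theory B predicts the penny slot, so a single observation separates them: `surviving_theories(THEORIES, PROBE, "penny")` keeps only the weight theory.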

Notice  that  the  usefulness  of  the  logic  formalism  has 
two  levels.  On  one  level,  we  can  use  the  formalism  to 
describe  different  content  theories,  e.g.  the  theory  that 
the  coins  are  being  sorted  by  color  versus  one  that  they 
are being sorted by weight. We can use the inference
machinery that comes with logic to derive consequences
of different axioms and test one theory of representational
content against another. For this purpose, there
is no need to commit oneself to how the insides of the
sorter work in any detail, except that information of
certain types is being used to make decisions of certain
types.  We  are  simply  using  logic  to  reason  about  the 
agent,  much  as  it  is  used  in  computer  science  to  reason 
about  the  correctness  of  a  computer  program  written  in 
some  other  language  than  logic.  We  are  using  logic  to 
give  a  Knowledge  Level  description  of  the  system. 

The second use of logic may be to model, or carry
out, internal processing. For example, the coin-sorter
might have dedicated Prolog chips inside that
actually implement the theorem provers. The coin-sorter
might literally work by actuating an arm that
places  the  coins  in  the  slots  as  soon  as  the  results  of  the 
theorem  provers  are  available. 

Decision  tree  coin-sorter.  The  second  theorist  observes 
the  coin-sorter  and  announces  that  its  behavior  can  be 
described  by  a  decision  tree.  In  a  decision  tree  machine, 
there  is  an  initial  decision  made  between  two  groups, 
e.g. between the group consisting of the nickel and the
quarter, and the group consisting of the dime and the
penny. For each of the groups, at the next point in the
tree, an additional decision is made to make a choice
among subgroups, and this is repeated until each leaf
node corresponds to one of the elements of the original
group.  We  now  have  a  decision  tree.  In  the  coin-sorter 
example,  we  would  only  need  two  levels  in  the  tree.  The 
criteria  for  the  decisions  at  each  node  are  given  in  the 
form  of  rules  involving  values  of  measurements  made 
on  the  coin. 
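
A two-level tree of this kind might look like the sketch below. The first decision separates the nickel and quarter from the dime and penny, and the second picks one coin within each group. The choice of weight for the first split and diameter for the second, the threshold values, and the name `sort_by_tree` are assumptions made for illustration; the thresholds sit roughly midway between approximate nominal figures for US coins.

```python
def sort_by_tree(diameter_mm: float, weight_g: float) -> str:
    """Two-level decision tree over measurements made on the coin."""
    # Level 1: heavy group (nickel, quarter) versus light group (dime, penny).
    if weight_g >= 4.0:
        # Level 2, heavy group: the quarter is the wider coin.
        return "quarter" if diameter_mm >= 22.7 else "nickel"
    # Level 2, light group: the penny is the wider coin.
    return "penny" if diameter_mm >= 18.5 else "dime"
```

Unlike a set of axioms, this description fixes the order of the decisions: weight is always consulted before diameter, and every coin reaches some leaf.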

Again,  we  can  use  the  formalism  as  a  descriptive 
device,  or  as  a  commitment  to  a  certain  internal 
processing.  For  example,  as  a  descriptive  device,  the 
decision  tree  still  enables  us  to  propose  different  content 
theories,  not  only  about  what  aspects  of  the  coins  are 
measured  as  in  the  logic  case,  but  also  about  what  sets 
of  decisions  are  made  before  what  decisions.  In  this 
sense,  what  was  left  as  a  feature  of  internal  processing 
in  the  use  of  logic  for  external  description,  namely  some 
aspect  of  control  strategy,  is  actually  now  made  part  of 
the  external  description  of  the  device.  The  axiom  system 
made no commitment to control. This expresses the
difference between a Knowledge Level account and a
program level account.

On  the  other  hand,  similar  to  the  logic  case,  one  can 
imagine  microprocessors  actually  implementing  the 
decision  tree  algorithm,  using  the  measurements  to 
make the choices in the tree, and activating the coin-placing
mechanism appropriately when a leaf node is
reached. 

Connectionist  network  coin-sorter.  The  connectionist 
claims  that  what  is  really  going  on  in  the  coin  sorter 
involves  the  same  features,  diameter,  color,  or  whatever, 
as the other theories assumed, but these pieces of evidence are
‘weighted’ and combined as in a connectionist network.
Different theories of representational content could still
be represented by identifying the nodes with different
types of measurement. How the information is used can
be described by means of different weights and
thresholds in the network. Intermediate abstractions
may be captured by hidden units. The intermediate
abstractions are combined with other intermediate
abstractions and again weighted, and higher level
decision units are constructed. A specific output node is
identified for each coin. The ‘energy’ at the output
nodes  will  be  a  function  of  how  much  evidence  is 
coming  through  for  the  coin  for  which  it  stands.  The 
output  node  corresponding  to  the  largest  activation  will 
be  chosen  as  the  decision  node. 
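
A stripped-down version of this account is sketched below: two input nodes (diameter and weight), one output node per coin, and the coin whose output node has the largest activation wins. It is a deliberately small illustration with no hidden units or thresholds, and the weights are set by hand rather than learned. The names, the choice of features, and the weight-setting rule are assumptions introduced here; the nominal measurements are approximate figures for US coins.

```python
# Approximate nominal (diameter in mm, weight in g) per coin type.
PROTOTYPES = {
    "penny": (19.05, 2.50),
    "nickel": (21.21, 5.00),
    "dime": (17.91, 2.268),
    "quarter": (24.26, 5.67),
}

# Hand-set connection weights and biases (an assumption, not learned values).
# With w = 2p and b = -|p|^2, the output node whose prototype p lies closest
# to the input receives the largest weighted sum.
WEIGHTS = {name: (2.0 * d, 2.0 * w) for name, (d, w) in PROTOTYPES.items()}
BIASES = {name: -(d * d + w * w) for name, (d, w) in PROTOTYPES.items()}


def activations(diameter_mm: float, weight_g: float) -> dict:
    """Weighted evidence arriving at each output node."""
    return {
        name: WEIGHTS[name][0] * diameter_mm
        + WEIGHTS[name][1] * weight_g
        + BIASES[name]
        for name in PROTOTYPES
    }


def sort_by_network(diameter_mm: float, weight_g: float) -> str:
    """Choose the output node with the largest activation."""
    acts = activations(diameter_mm, weight_g)
    return max(acts, key=acts.get)
```

Because the evidence is combined as real-valued sums, a slightly worn coin such as `sort_by_network(19.0, 2.6)` still lands in the penny slot without any explicit tolerance rule.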

Pretty  much  all  the  points  we  made  about  logic  and 
decision  trees  can  be  repeated  for  this  account  as  well. 
The  connectionist  framework  can  be  used  to  describe 
content  theories  about  what  information  is  used,  and  to 
give  an  account  of  what  evidence  is  combined  in  what 
proportion  with  what  other  evidence.  Inferences  about 
different  content  theories  can  be  made  and  tested.  At 
this  level,  no  commitment  needs  to  be  made  that  the 
inside  of  the  sorter  is  literally  a  connectionist  machine. 
On  the  other  hand,  the  connectionist  network  can  be 
used  as  the  internal  information  processor  as  well. 

Voilà: Levers and holes! Let us now open the coin-sorter
and look at its inside. We see that as you put a
coin in, it passes through levers and holes, all cleverly
arranged such that the coin makes its way to the right
slot. Clearly, the different weights and sizes of the
coins have different effects on the levers and the holes.
There are no Prolog chips or microprocessors or
connectionist  networks  inside  the  black  box,  just 
mechanical  parts.  The  physicalist,  the  one  who  does  not 
believe  in  representations,  smiles. 

Does the sorter have a knowledge state interpretation?

In response to the question, ‘How did the quarter end
up in the slot named “quarter”?’, two kinds of answers,
both correct, can be given. In one, the answer would be
physicalist: an account of the coin’s movement through
the inside of the sorter following the physical laws. In
the other, the answer would be in terms of how the
levers and holes ‘use’ information about the diameter
and the weight and how the sorter ‘decides’ about the
coin’s direction of movement. Clearly, whoever designed it
designed  the  sorter  in  such  a  way  that  there  is  a  close 
mapping  between  the  information  story  and  the 
physical  story.  Because  of  this  mapping,  one  can  talk 
about  the  sorter  being  in  various  knowledge  states.  Of 
course,  if  the  sorter  that  works  by  levers  and  holes  has 
a consistent interpretation in terms of knowledge states,
then  certainly  any  sorter  that  actually  had  a  chip 
proving  theorems  or  implementing  the  decision  tree 
algorithm  or  the  connectionist  network  will  also  have  a 
similar  interpretation.  That  is,  the  knowledge  state  and 
information  processing  talk  is  applicable  to  all  devices 
whose behavior has a decision-making interpretation,
irrespective  of  how  they  actually  work. 

We  can  see  that  the  logic  account,  the  decision  tree 
algorithm  account  and  the  connectionist  account  are  all 
alternative languages in which to couch the information
processing account. While all three frameworks can be
used to describe information representation and
processing, they are not all equivalent. Connectionism
enables one to talk about ‘softer’ combination of
information using real numbers, while logic enables us
to talk about variables and quantification, and the
language of algorithms enables us to talk about control
strategies. However, our main point here is that they
can all be used as frameworks for describing information
representation and processing, and also for
implementing  information  processing.  In  Newell's 
language,  they  can  be  used  both  as  languages  for  the 
Knowledge  Level  and  for  the  Symbol  Level. 

The  coin-sorter  is  a  simple  device,  but  it  illustrates 
the  issues  with  respect  to  understanding  biological 
brains.  People  take  a  whole  range  of  stances  on 
whether  the  brain  is  actually  doing  information 
processing  on  representations.  Strong  materialists  argue 
that  representationalist  accounts  of  such  systems  are 
wrong,  and  the  only  scientifically  acceptable  causal 
story  is  at  the  level  of  the  matter  that  composes  the 
brain.  Edelman  is  also  against  the  information 
processing  account,  but  his  counter-proposal  is  in  terms 
of an abstract pathway selection account, which is still
an  abstract  functional  architecture  (i.e.  no  appeal  to 
physical  laws  is  made),  though  not  an  information 
processing  one.  Among  those  who  agree  that  there  is  a 
causal  story  to  be  told  at  the  level  of  representations, 
there  are  many  divisions,  but  broadly,  we  can 
distinguish  between  connectionist  style  representations 
and  discrete  symbol  structure  representations.  The 
moral  of  our  analysis  of  the  coin-sorter  is  that  for 
explaining behavior which itself is couched in informational
terms, the information processing account is
useful  as  a  stance  to  describe  the  biological  brain. 

Much  of  the  argument  in  the  field  is  a  result  of  a 
confusion  between  two  senses  of  being  an  information 
processor  using  representations.  In  one  sense,  when  we 
ask  whether  the  brain  processes  information  we  are 
really  asking  whether  it  is  appropriate  to  ascribe 
informational  activity  to  the  brain  and  in  the  other 
sense  we  are  literally  describing  what  the  brain  or 
device  actually  does.  Ascribing  information  processing 
is  to  take  an  information  processing  stance.  For 
example  we  might  ascribe  information  processing 
activity  to  the  visual  system  on  the  grounds  that  it 
produces  information  about  the  world.  This  is  the  sense 
of  information-processing  we  are  using  when  we  stand 
outside  the  brain  and  look  at  behavior  and  ascribe  an 
information-processing  structure  to  the  behavior  that 
we  see.  When  we  look  at  a  black  box  coin  sorter  as  a 
decision  maker  and  work  out  a  model  of  its 
input/output  behavior,  we  are  ascribing  information 
processing  to  it. 

However, taking an informational stance whereby we
ascribe information processing to a device (or brain)
does not commit us to that device literally processing
information, or using representations, in the specific
medium in which the description is made. There is a
fact of the matter about whether the information
processing is being done in one medium or another. At
some point the behavior of the sorter which employs a
Prolog theorem prover will be different from that based
on levers and holes. When the latter sorter malfunctions,
the explanation may be given in terms of
physical properties, such as a lever being jammed, while
in the case of the former type of sorter, the explanation
might be in terms of an error in the program in the chip
or some hardware failure in the chip. (And in the case
of the brain, in addition to the problem of failure
modes, there are other issues where the medium
becomes relevant; properties related to learning are one
example.) But for most purposes where people think
that the issue is the medium of representation, the issue
often turns out to be one that can be formulated at the
Knowledge Level.

We  can  certainly  ask  similar  questions  about  the 
brain.  It  is  a  matter  of  fact  whether  the  brain  is  an 
information processor of the ‘physicalist’ type, one of
the connectionist variety, or one that has Turing
machine-like symbols. (Putnam has argued that even
whether a piece of matter is a Turing machine is just a
stance, but we think that the consensus is that
Putnam’s argument does not really work, and that not
all pieces of matter can be interpreted as a given Turing
machine.) But as long as we are interested in aspects of
the organism’s behavior that have an informational
flavor (such as decision-making), talk of information
and its use is necessary in the analysis, just as it
was in the case of the coin-sorter. Much of the criticism
of the information processing view (from Edelman, e.g.)
is based on a narrow view of
what  the  information-processing  talk  commits  one  to. 
Conversely,  many  proponents  of  information  processing 
explanations  are  also  committed  to  such  a  narrow  view, 
making  far  more  commitments  about  internal  processes 
than  necessary. 

In  the  rest  of  the  article,  we  will  adopt  this  broad 
sense of information processing or knowledge state
account as a stance that is useful in describing agents to
which  we  ascribe  cognitive  capacities. 


Architectures for intelligence

We now move to a discussion of architectural proposals
within  the  information  processing  perspective.  Our  goal 
is  to  try  to  place  the  multiplicity  of  proposals  into 
perspective.  As  we  review  various  proposals,  we  will 
present some judgements of our own about relevant
issues. But first, we need to review the notion of an
architecture  and  make  some  additional  distinctions. 

Form  and  content  issues  in  architectures 

In computer science, a programming language corresponds
to a virtual architecture. A specific program in
that language describes a particular (virtual) machine,
which then responds to various inputs in ways defined
by the program. The architecture is thus what Newell
calls the fixed structure of the information processor
that is being analysed, and the program specifies a
variable structure within this architecture. We can
regard the architecture as the form and the program as
the content, which together instantiate a particular
information-processing machine. We can extend these
intuitions to types of machines which are different from
computers. For example, the connectionist architecture
can be abstractly specified as the set
$\{\{N\}, \{n_i\}, \{n_o\}, \{f_i\}, \{w_{ij}\}\}$,
where $\{N\}$ is a set of nodes, $\{n_i\}$ and $\{n_o\}$
are subsets of $\{N\}$ called input and output nodes
respectively, $\{f_i\}$ are the functions computed by the
nodes, and $\{w_{ij}\}$ is the set of weights between nodes. A
particular connectionist machine is then instantiated by
the ‘program’ that specifies values for all these
variables.
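
The split between fixed architecture and variable 'program' can be sketched as a data structure. In the hypothetical class below, the field layout and the evaluation rule play the role of the architecture, and the particular values supplied for nodes, node functions and weights play the role of the program. The class name, the feed-forward evaluation order, and the example AND-gate machine are assumptions introduced for illustration and are not taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class ConnectionistMachine:
    nodes: List[str]                                # all nodes, in evaluation order
    input_nodes: List[str]                          # subset of nodes used as inputs
    output_nodes: List[str]                         # subset of nodes read as outputs
    functions: Dict[str, Callable[[float], float]]  # function computed by each node
    weights: Dict[Tuple[str, str], float]           # weight on each (source, target) link

    def __post_init__(self) -> None:
        # Input and output nodes must be subsets of the node set.
        if not set(self.input_nodes) <= set(self.nodes):
            raise ValueError("input nodes must be a subset of nodes")
        if not set(self.output_nodes) <= set(self.nodes):
            raise ValueError("output nodes must be a subset of nodes")

    def run(self, inputs: Dict[str, float]) -> Dict[str, float]:
        """Feed-forward pass; assumes nodes are listed after their sources."""
        value = dict(inputs)
        for node in self.nodes:
            if node in self.input_nodes:
                continue
            total = sum(w * value[src]
                        for (src, dst), w in self.weights.items() if dst == node)
            value[node] = self.functions[node](total)
        return {node: value[node] for node in self.output_nodes}


# One particular 'program': values that instantiate a two-input AND gate.
and_gate = ConnectionistMachine(
    nodes=["a", "b", "out"],
    input_nodes=["a", "b"],
    output_nodes=["out"],
    functions={"out": lambda total: 1.0 if total >= 1.5 else 0.0},
    weights={("a", "out"): 1.0, ("b", "out"): 1.0},
)
```

Supplying different values to the same class yields a different machine, while the class itself stays fixed.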

We  have  made  a  distinction  between  an  architecture, 
the  form  in  which  the  architecture  will  accept  content 
(the  programming  language  form)  and  the  content  of 
the  representation  itself.  When  we  explain  specific  types 
of  cognitive  phenomena,  we  will  end  by  coming  up  with 
a  complex  budget  of  credit  allocation;  some  aspects  will 
be  explained  by  the  properties  of  the  architecture 
(perhaps  some  timing  phenomena,  and  also  some 
aspects  of  learning),  some  will  be  explained  by  the  sort 
of  information  that  is  involved  in  the  content.  Credit 
allocation in this manner is a tricky analytic task.

We also need to make an additional distinction
between micro- and macro-architectures, a distinction
that is especially useful for cognition. A bank of
information processors of identical type connected in
some way has a macro-architectural description in
terms of the modules and their connections, while the
entire system has a uniform micro-architectural description.

Many  AI  and  cognitive  science  theories  are  really
theories  about  the  content  of  knowledge,  or  types  of
knowledge,  needed  for  some  task  of  interest,  with
minimal  commitment  to  the  architecture.  Many  debates
in  the  field,  which  are  ostensibly  about  the  architecture,
turn  out  to  be  about  the  types  of  knowledge.  For
example,  Dreyfus*  talks  about  ‘What  computers  cannot
do’.  It  turns  out  that  he  is  opposed  to  the  idea  that
intelligence  can  come  out  of  a  system  that  has  a
knowledge  base  which  explicitly  and  exhaustively
represents  world  facts  and  relationships  in  some  logical
form.  However,  there  are  people  within  computational 
AI  who  have  been  making  this  point  as  well.  For 
example,  Schank’  has  argued  that  our  knowledge  is  not
in  the  above  form  of  abstract  facts  at  all,  but  rather  in 
the  form  of  experiences  indexed  and  abstracted  in 
various  ways.  Thus  the  issue,  at  least  based  on  Dreyfus' 
arguments,  is  not  what  computers  cannot  do,  but  what 
certain  kinds  of  knowledge  representations  cannot  do. 
It  may  turn  out  that  the  kind  of  information  that 
Dreyfus  sees  as  necessary  cannot  be  represented  in 
computers  either,  but  he  does  not  make  the  arguments 
for  this  position. 

We  are  now  ready  to  give  an  overview  of  the  issues  in 
cognitive  architectures.  We  will  assume  that  the  reader 
is  already  familiar  in  some  general  way  with  the 
proposals  that  we  are  discussing.  Our  goal  is  to  place 
these  ideas  in  perspective. 

Intelligence  as  just  computation

Until  recently  the  dominant  paradigm  for  thinking 
about  information  processing  has  been  the  Turing 
computer  framework,  or  what  has  been  called  the 
discrete  symbol  system  approach.  Information  processing 
theories  are  formulated  as  algorithms  operating  on  data 
structures.  In  fact  AI  was  launched  as  a  field  when 
Turing  proposed  in  a  famous  paper  that  thinking  was
computation  (the  term  ‘artificial  intelligence’  itself  was
coined  later).  A  natural  question  in  this  framework 
would  be  whether  the  set  of  computations  that  underlie 
thinking  is  a  subset  of  Turing-computable  functions, 
and  if  so,  how  the  properties  of  this  subset  should  be
characterized. 

Because  of  the  technological  nature  of  much  of  AI, 
only  a  small  number  of  researchers  have  been 
concerned  with  intelligence  in  general.  Most  of  the
work  consists  of  algorithms  for  specific  problems  that 
seem  to  require  intelligence  and  that  are  practically 
important.  Algorithms  for  diagnosis,  design,  planning, 
etc.  are  proposed,  because  these  tasks  are  seen  as 
important  for  an  intelligent  agent.  But  as  a  rule  no
effort  is  made  to  relate  the  algorithm  for  the  specific 
task  to  a  general  architecture  for  intelligence.  While 
such  algorithms  are  useful  as  technologies  and  to  make 
the  point  that  several  tasks  that  appear  to  require 
intelligence  can  be  done  by  certain  classes  of  machines,
they  do  not  give  much  insight  into  intelligence  in
general.

Architectures  for  deliberation

Historically  most  of  the  intuitions  in  AI  about
intelligence  have  come  from  introspections  about
human  consciousness,  specifically  about  what  people
perceived  to  be  the  relationships  among  conscious
thoughts.  We  are  aware  of  having  thoughts  which  often 
follow  one  after  another.  These  thoughts  are  mostly 
couched  in  the  medium  of  natural  language,  but 
sometimes  thoughts  include  mental  images  as  well.
When  people  are  thinking  for  a  purpose,  say  for
problem  solving,  there  is  a  sense  of  directing  thoughts,
choosing  some,  rejecting  others,  and  focusing  them
towards  the  goal.  Activity  of  this  type  has  been  called
‘deliberation’.  Deliberation,  for  humans,  is  a  coherent
goal-directed  activity,  lasting  over  several  seconds  or 
longer.  For  many  people  thinking  is  the  act  of 
deliberating  in  this  sense.  Activities  in  this  time  span 
should  be  contrasted  with  other  cognitive  phenomena, 
which,  in  humans,  take  under  a  few  hundred 
milliseconds:  real-time  natural  language  understanding 
and  generation,  visual  perception,  being  reminded  of 
things,  and  so  on. 

Different  kinds  of  theories  about  the  architecture  of 
the  cognitive  machine  have  been  proposed  depending 
upon  what  kinds  of  patterns  among  these  thoughts  the 
researchers  have  been  struck  by.  Two  groups  of 
proposals  about  such  patterns  have  been  influential  in 
AI  theory-making:  the  reasoning  view  and  the  goal- 
subgoal  view. 

Deliberation  as  reasoning.  People  have  for  a  long  time 
been  struck  by  logical  relations  between  thoughts  and 
have  made  the  distinction  between  rational  and 
irrational  thoughts.  Remember  that  Boole’s  book  on
logic  was  titled  ‘Laws  of  Thought’.  Thoughts  often  have
a  logical  relation  between  them:  we  think  thoughts  A
and  B,  then  thought  C,  where  C  follows  from  A  and  B.
In  AI,  this  view  has  given  rise  to  an  idealization  of
intelligence  as  rational  thought,  and  consequently  to 
the  view  that  the  appropriate  architecture  is  one  whose 
behavior  is  governed  by  rules  of  logic.  In  AI,  McCarthy 
is  most  closely  identified  with  the  logic  approach  to
AI,  and  ref.  10  is  considered  a  clear  early  statement  of
some  of  the  issues  in  the  use  of  logic  for  building  an
intelligent  machine.  Researchers  in  AI  disagree  about
how  to  make  machines  which  display  this  kind  of
rationality.  One  group  proposes  that  the  ideal  thought
machine  is  a  logic  machine,  one  whose  architecture  has
logical  rules  of  inference  as  its  primitive  operators.
These  operators  work  on  a  storehouse  of  knowledge
represented  in  a  logical  formalism  and  generate
additional  thoughts.  For  example,  the  Japanese  Fifth
Generation  project  came  up  with  computer  architectures
whose  performance  was  measured  in  (millions  of)
inferences  per  second.  The  other  group  believes  that  the
architecture  itself  (i.e.  the  mechanism  that  generates
thoughts)  is  not  a  logic  machine,  but  one  which
generates  plausible,  but  not  necessarily  correct,  thoughts,
and  then  knowledge  of  correct  logical  patterns  is  used
to  make  sure  that  the  conclusion  is  appropriate. 

Historically  rationality  was  characterized  by  the  rules 
of  deduction,  but  in  AI,  the  notion  is  being  broadened 
to  include  a  host  of  non-deductive  rules  under  the 
broad  umbrella  of  'non-monotonic  logic'*  *  or  'default 
reasoning',  to  capture  various  plausible  reasoning  rules.
There  is  considerable  difference  of  opinion  about  whether
such  rules  exist  in  a  domain-independent  way  as  in  the 
case  of  deduction,  and  how  large  a  set  of  rules  would  be 
required  to  capture  all  plausible  reasoning  behaviors.  If 
the  number  of  rules  is  very  large,  or  if  they  are  context- 
dependent  in  complicated  ways,  then  logic  architectures 
would  become  less  practical.

At  any  point  in  the  operation  of  the  architecture, 
many  inference  rules  might  be  applied  to  a  situation 
and  many  inferences  drawn.  This  brings  up  the  control 
issue  in  logic  architectures,  i.e.  decision  about  which 
inference  rule  should  be  applied  when.  Logic  itself 
provides  no  theory  of  control.  The  application  of  the 
rule  is  guaranteed,  in  the  logic  framework,  to  produce  a 
correct  thought,  but  whether  it  is  relevant  to  the  goal  is 
decided  by  considerations  external  to  logic.  Control 
tends  to  be  task-specific,  i.e.  different  types  of  tasks  call 
for  different  strategies.  These  strategies  have  to  be 
explicitly  programmed  in  the  logic  framework  as 
additional  knowledge. 

Deliberation  as  goal-subgoaling.  An  alternate  view  of 
deliberation  is  inspired  by  another  perceived  relation 
between  thoughts  and  provides  a  basic  mechanism  for
control  as  part  of  the  architecture.  Thoughts  are  often
linked  by  means  of  a  goal-subgoal  relation.  For
example,  you  may  have  a  thought  about  wanting  to  go
to  New  Delhi,  then  you  find  yourself  having  thoughts
about  taking  trains  and  airplanes,  and  about  which  is 
better,  then  you  might  think  of  making  reservations  and 
so  on.  Newell  and  Simon'^  have  argued  that  this 
relation  between  thoughts,  the  fact  that  goal  thoughts 
spawn  subgoal  thoughts  recursively  until  the  subgoals 
are  solved  and  eventually  the  goals  are  solved,  is  the 
essence  of  intelligence  as  a  mechanism.  More  than  one 
subgoal  may  be  spawned,  and  so  backtracking  from 
subgoals  that  did  not  work  out  is  generally  necessary. 
Deliberation  thus  looks  like  search  in  a  problem  space. 
Setting  up  the  alternatives  and  exploring  them  is  made
possible  by  the  knowledge  that  the  agent  has.  In  the 
travel  example  above,  the  agent  had  to  have  knowledge 
about  different  possible  ways  to  get  to  New  Delhi  and
knowledge  about  how  to  make  a  choice  between
alternatives.  A  long  term  memory  is  generally  proposed
which  holds  the  knowledge  and  from  which  knowledge
relevant  to  a  goal  is  brought  to  play  during
deliberation.  This  analysis  suggests  an  architecture  for
deliberation  which  retrieves  relevant  knowledge,  sets  up
a  set  of  alternatives  to  explore  (the  problem  space),
explores  it,  sets  up  subgoals,  etc.
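
The retrieve, set up alternatives, explore, subgoal cycle can be sketched in a few lines. This is only an illustration under stated assumptions, not the mechanism of any particular system: the contents of the long term memory, the set of primitive goals, and the function name `achieve` are all hypothetical, and the search is a plain depth-first one with backtracking.

```python
from typing import Dict, List, Optional, Set

# Hypothetical long term memory: each goal maps to alternative methods,
# and each method is a list of subgoals that must all be achieved.
LONG_TERM_MEMORY: Dict[str, List[List[str]]] = {
    "be in New Delhi": [["take train"], ["take airplane"]],
    "take train": [["have train ticket", "have two free days"]],
    "take airplane": [["have plane reservation"]],
    "have plane reservation": [["call travel agent"]],
}

# Hypothetical goals that can be achieved directly, without further subgoaling.
PRIMITIVES: Set[str] = {"have train ticket", "call travel agent"}


def achieve(goal: str, depth: int = 0, max_depth: int = 10) -> Optional[List[str]]:
    """Return the primitive actions that achieve `goal`, or None if no method works."""
    if goal in PRIMITIVES:
        return [goal]
    if depth >= max_depth:
        return None
    # Retrieve the alternatives for this goal: together they form the problem space.
    for method in LONG_TERM_MEMORY.get(goal, []):
        plan: List[str] = []
        for subgoal in method:
            subplan = achieve(subgoal, depth + 1, max_depth)
            if subplan is None:
                break            # this alternative failed: backtrack to the next one
            plan.extend(subplan)
        else:
            return plan          # every subgoal of this method was achieved
    return None
```

With the memory above, `achieve("be in New Delhi")` first tries the train, fails on the subgoal for which no knowledge is available, backtracks, and succeeds through the airplane alternative. Control here comes from the goal-subgoal structure itself rather than from separately programmed strategies.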

The  most  recent  version  of  an  architecture  for 
deliberation  in  the  goal-subgoal  framework  is  Soar*. 
Soar  has  two  important  attributes.  The  first  is  that  any 
difficulty  it  has  in  solving  any  subgoal  simply  results  in 
the  setting  up  of  another  subgoal,  and  knowledge  from
long  term  memory  is  brought  to  bear  in  its  solution.  It
might  be  remembered  that  Newell's  definition  of
intelligence  is  the  ability  to  realize  the  knowledge  level
potential  of  an  agent.  Deliberation  and  goal  subgoaling
are  intended  to  capture  that  capability:  any  piece  of
knowledge  in  long  term  memory  is  available,  if  it  is
relevant,  for  any  goal.  Repeated  subgoaling  will  bring
that  knowledge  to  deliberation.  The  second  attribute  of
Soar  is  that  it  ‘caches’  its  successes  in  problem  solving
in  its  long  term  memory.  The  next  time  there  is  a
similar  goal,  that  cached  knowledge  can  be  directly
used,  instead  of  searching  again  in  the  corresponding 
problem  space. 
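
The caching attribute can be illustrated with a small sketch. It is an assumption-laden simplification rather than a description of Soar itself: the class and function names are hypothetical, the stand-in search is a placeholder computation, and the cache matches only identical goals, whereas the text speaks of similar goals, which would require some generalization over the stored result.

```python
from typing import Callable, Dict, Hashable, Optional


class CachingSolver:
    """Hypothetical wrapper that stores successful results in a long term memory."""

    def __init__(self, search: Callable[[Hashable], Optional[object]]) -> None:
        self._search = search
        self.long_term_memory: Dict[Hashable, object] = {}
        self.searches_run = 0

    def solve(self, goal: Hashable) -> Optional[object]:
        # Recognition: a previously solved goal is answered directly from memory.
        if goal in self.long_term_memory:
            return self.long_term_memory[goal]
        # Deliberation: fall back to search in the problem space.
        self.searches_run += 1
        result = self._search(goal)
        if result is not None:
            # Only successes are cached; failures are searched again next time.
            self.long_term_memory[goal] = result
        return result


def slow_search(goal: Hashable) -> Optional[int]:
    """Placeholder for an expensive search; returns None to signal failure."""
    if isinstance(goal, int) and goal >= 0:
        return sum(range(goal + 1))
    return None


solver = CachingSolver(slow_search)
```

Calling `solver.solve(4)` twice runs the search once; the second call is answered by lookup, which is the shift from deliberation to recognition that the text describes.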

This  kind  of  deliberative  architecture  confers  on  the 
agent  the  potential  for  rationality  in  two  ways.  With  the 
right  kind  of  knowledge,  each  goal  results  in  plausible
and  relevant  subgoals  being  set  up.  Second,  ‘logical
rules’  can  be  used  to  verify  that  the  proposed  solution
to  subgoals  is  indeed  correct.  But  such  rules  of  logic  are
used  as  pieces  of  knowledge  rather  than  as  operators  of
the  architecture  itself.  Because  of  this,  the  verification
rules  can  be  context-  and  domain-dependent. 

Another  point  to  note  is  that  one  of  the  results  of  this 
form  of  deliberation  is  the  construction  of  special
purpose  algorithms  or  methods  for  specific  problems.
These  algorithms  can  be  placed  in  an  external  computational
medium  and  as  soon  as  a  subgoal  arises  that
such  a  method  or  algorithm  can  solve,  the  external
medium  can  solve  it  and  return  the  results.  For
example,  during  design  an  engineer  might  set  up  the
subgoal  of  computing  the  maximum  stress  in  a  truss,
and  invoke  a  finite  element  method  running  on  a
computer.  The  deliberative  engine  can  thus  create  and 
invoke  computational  algorithms.  The  goal-subgoaling 
architecture  provides  a  natural  way  to  integrate 
external  algorithms. 

In  the  Soar  view,  long  term  memory  is  just  an 
associative  memory.  It  has  the  capability  to  ‘recognize’
a  situation  and  retrieve  the  relevant  pieces  of  knowledge.
Because  of  the  learning  capability  of  the  architecture,
each  episode  of  problem  solving  gives  rise  to
continuous  improvement.  As  a  problem  comes  along,
some  subtasks  are  solved  by  external  computational
architectures  which  implement  special  purpose  algorithms,
while  others  are  directly  solved  by  compiled
knowledge  in  memory,  while  yet  others  are  solved  by
additional  deliberation.  This  cycle  makes  the  overall
system  increasingly  more  powerful.  Eventually,  most
routine  problems,  including  real-time  understanding
and  generation  of  natural  language,  are  solved  by
recognition.  (Recent  work  by  Patten  et  al.^^  on  the  use
of  compiled  knowledge  in  natural  language  understanding
is  compatible  with  this  view.)

Deliberation  seems  to  be  a  source  of  great  power  in 
humans.  Why  is  not  recognition  enough?  As  Newell 
points  out,  the  particular  advantage  of  deliberation  is
distal  access  to  and  combination  of  knowledge  at  run-time
in  a  goal-specific  way.  In  the  deliberative  machine,
temporary  connections  are  created  between  pieces  of
knowledge  that  are  not  hard-coded,  and  that  gives  it
the  ability  to  realize  the  knowledge  level  potential
more.  A  recognition  architecture  uses  knowledge  less
effectively;  if  the  connections  are  not  there  as  part  of  the
memory  element  that  controls  recognition,  a  piece  of
knowledge,  though  potentially  relevant,  will  not  be
utilized  in  the  satisfaction  of  a  goal. 

As  an  architecture  for  deliberation,  the  goal-subgoal 
view  seems  to  us  closer  to  the  mark  than  the  reasoning 
view.  As  we  have  argued  elsewhere*  ^,  logic  seems  more
appropriate  for  justification  of  conclusions  and  as  the 
framework  for  the  semantics  of  representations  than  for 
the  generative  architecture. 

AI  theories  of  deliberation  give  central  importance  to 
human-level  problem  solving  and  reasoning.  Any 
continuity  with  higher  animal  cognition  or  brain 
structure  is  at  the  level  of  the  recognition  architecture 
of  memory,  about  which  this  view  says  little  other  than 
that  it  is  a  recognition  memory.  For  supporting 
deliberation  at  the  human  level,  long  term  memory 
should  be  capable  of  storing  and  generating  knowledge 
with  the  full  range  of  ontological  distinctions  that 
human  language  has. 

Is  the  search  view  of  deliberation  too  narrow?  A
criticism  of  this  picture  of  deliberation  as  a  search
architecture  is  that  it  is  based  on  a  somewhat  narrow
view  of  the  function  of  cognition.  It  is  worth  reviewing
this  argument  briefly. 

Suppose  a  Martian  watches  a  human  in  the  act  of 
multiplying  numbers.  The  human,  during  this  task,  is 
emulating  some  multiplication  algorithm,  i.e.  appears  to
be  a  multiplication  machine.  The  Martian  might  well
return  to  his  superiors  and  report  that  the  human
cognitive  architecture  is  a  multiplication  machine,  but
we  know  that  the  multiplication  architecture  is  a
fleeting,  evanescent  virtual  architecture  that  emerges  as
an  interaction  between  the  goal  (multiplication)  and  the
procedural  knowledge  of  the  human.  With  a  different
goal,  the  human  might  behave  like  a  different  machine.
It  would  be  awkward  to  imagine  cognition  to  be  a
collection  of  different  architectures  for  each  such  task;
in  fact,  cognition  is  very  plastic  and  is  able  to  simulate
various  virtual  machines  as  needed. 


Is  the  problem  space  search  engine  that  has  been 
proposed  for  the  deliberative  architecture  such  an 
evanescent  machine?  One  argument  against  it  is  that  it 
is  not  intended  for  a  narrow  goal  like  multiplication, 
but  for  all  kinds  of  goals.  Thus  it  is  not  fleeting,  but 
always  operational. 

Or  is  it?  If  the  sole  purpose  of  the  cognitive 
architecture  is  goal  achievement  (or  'problem  solving'), 
then  it  is  reasonable  to  assume  that  the  architecture 
would  be  hard-wired  for  this  purpose.  What,  however,  if 
goal  achievement  is  only  one  of  the  functions  of  the 
cognitive  architecture,  common  though  it  might  be?  At 
least  in  humans,  the  same  architecture  is  used  to 
daydream,  just  take  in  the  external  world  and  enjoy  it,
and  so  on.  The  search  behavior  that  we  need  for 
problem  solving  can  come  about  simply  by  virtue  of  the 
knowledge  that  is  made  available  to  the  agent's 
deliberation  from  long  term  memory.  This  knowledge  is 
either  a  solution  to  the  problem,  or  a  set  of  alternatives 
to  consider.  The  agent,  faced  with  the  goal  and  a  set  of 
alternatives,  simply  considers  the  alternatives  in  turn, 
and  when  additional  subgoals  are  set,  repeats  the
process  of  seeking  more  knowledge.  In  fact,  this  kind  of 
search  behavior  happens  not  only  with  individuals,  but 
with  organizations.  They  explore  alternatives,  but  we 
do  not  see  a  need  for  a  fixed  search  engine  for 
explaining  organizational  behavior.  Deliberation  of 
course  has  to  have  the  right  sort  of  properties  to  be 
able  to  support  search.  Certainly  adequate  working 
memory  needs  to  be  there,  and  probably  there  are  other
constraints  on  deliberation,  but  it  does  not  have  to  be
exclusively  a  search  architecture.  Just  as  the  multiplication
machine  was  an  emergent  architecture  when  the
agent  was  faced  with  that  task,  the  search  engine  is  the
corresponding  emergent  architecture  for  the  agent  faced
with  a  goal  and  equipped  with  knowledge  about  what
alternatives  to  consider.  In  fact,  a  number  of  other  such
emergent  architectures  built  on  top  of  the  deliberative
architecture  have  been  studied  earlier  in  our  work  on
Generic  Task  architectures*’.  These  architectures  were
intended  to  capture  the  needs  for  specific  classes  of
goals  (such  as  classification).

The  above  argument  is  not  to  deemphasize  the 
importance  of  problem  space  search  for  goal  achievement,
but  to  resist  the  identification  of  the  architecture
of  the  conscious  processor  with  one  exclusively 
intended  for  search.  The  problem  space  architecture  is 
still  important  as  the  virtual  architecture  for  goal- 
achieving,  since  it  is  a  common,  though  not  the  only, 
function  of  cognition. 

Of  course,  that  cognition  goes  beyond  just  goal 
achievement  is  a  statement  about  human  cognition.  If
we  take  a  design  perspective  and  seek  to  specify  an
architecture  for  a  function  called  intelligence  which
itself  is  defined  in  terms  of  goal  achievement,  then
clearly  we  are  free  to  design  an  architecture  best  suited
for  that  purpose.  A  deliberative  search  architecture 
working  with  a  long  term  memory  of  knowledge 
certainly  has  many  attractive  properties  for  this 
purpose  as  we  have  discussed  in  this  section. 

Architectures  below  deliberation 

We  made  a  distinction  between  cognitive  phenomena 
that  occur  in  under  a  few  hundred  milliseconds  and 
those  that  evolve  over  longer  time  spans,  and  covered 
the  latter  under  deliberation.  We  will  call  the 
architecture  that  handles  the  former  phenomena 
subdeliberative.  In  deliberation,  we  have  access  to  a 
number  of  intermediate  states  in  problem  solving.  After 
you  finished  planning  the  New  Delhi  trip,  I  can  ask  you 
what  alternatives  you  considered,  why  you  rejected 
taking  the  train,  and  so  on,  and  your  answers  to  them
will  generally  be  reliable.  You  were  probably  aware  of 
rejecting  the  train  option  because  you  calculated  that  it 
would  take  too  long.  On  the  other  hand,  we  have 
generally  no  clue  about  how  the  subdeliberative 
architecture  came  to  any  conclusion.  If  you  recognize 
someone  after  not  having  seen  him  for  twenty  years, 
and  that  person  expresses  surprise  by  asking,  ‘I  have 
changed  a  lot  in  twenty  years.  How  did  you  recognize 
me?’,  you  may  come  up  with  something  like,  ‘I  bet  it  is
your  nose!’,  but  you  cannot  be  sure.  You  have  no  access
to  how  your  perception  system  actually  recognized  that 
person.  Similarly,  you  may  have  your  own  theory  of 
why  you  were  reminded  of  something,  but  you  have  no 
special  access  to  what  went  on  during  that  reminding.
Freud's  model  of  mind  had  complicated  unconscious
processes  working,  and  in  fact,  in  this  view,  consciousness
was  often  misled  about  the  real  content  of  these
unconscious  processes. 

Many  people  in  AI  and  cognitive  science  feel  that  the 
emphasis  on  complex  problem  solving  as  the  door  to 
understanding  intelligence  is  misplaced,  and  that 
theories  that  emphasize  rational  problem  solving  only 
account  for  very  special  cases  and  do  not  account  for 
the  general  cognitive  skills  that  are  present  in  ordinary 
people.  This  group  of  researchers  focus  almost 
completely  on  the  nature  of  the  subdeliberative 
architecture.  There  is  also  a  belief  that  the  subdeliberative
architecture  is  directly  reflected  in  the  structure  of
the  neural  machinery  in  the  brain.  Thus,  some  of  the 
proposals  for  the  subdeliberative  architecture  claim  to 
be  inspired  by  the  structure  of  the  brain  and  claim  a 
biological  basis  in  that  sense. 

Alternative  proposals.  The  various  proposals  differ
along  a  number  of  dimensions:  what  kinds  of  tasks  the
architecture  performs,  degree  of  parallelism,  whether  it
is  an  information  processing  architecture  at  all,  and
when  it  is  taken  to  be  an  information  processing  architecture,
whether  it  is  a  symbolic  one  or  some  other  type.

With  respect  to  the  kind  of  tasks  the  architecture 
performs,  we  already  mentioned  Newell's  view  that  it  is 
just  a  recognition  architecture.  Any  smartness  it 
possesses  is  a  result  of  good  abstractions  and  good 
indexing,  but  architecturally,  there  is  nothing  particularly 
complicated.  In  fact,  the  good  abstractions  and
indexing  themselves  were  the  result  of  the  discoveries  of
deliberation  during  problem  state  search.  Being  smarter,
from  the  Newell  perspective,  is  done  by  converting 
more  and  more  deliberative  problems  into  stored 
recognition  patterns  through  chunking.  The  real 
solution  to  the  problem  of  memory,  for  Newell,  is  to  get 
chunking  done  right:  the  proper  level  of  abstraction, 
labeling  and  indexing  is  all  done  at  the  time  of 
chunking.  Theories  of  memory  representation  (such  as 
Schank's)  are  in  this  sense  content  theories  of  indices 
and  labels,  not  architectural  theories.  Such  content 
theories  of  memory  are  not  really  in  conflict  with  the 
Newell  theory  of  deliberative  architecture,  since  the
latter  merely  gives  a  way  for  the  content  to  come  to  be 
the  way  it  is. 

In  contrast  to  the  recognition  view  are  proposals  that 
see  relatively  complex  problem  solving  activities  going 
on  in  subdeliberative  cognition.  Minsky**  originally 
proposed  a  specific  architecture  for  memory  based  on 
frames,  which  are  organized  as  a  network  of  concepts, 
each  of  which  contained  prototypical  information 
about  the  concept.  Relatively  complex  procedures  were 
embedded  in  these  concepts.  More  recently,  he  has 
outlined  a  Society  of  Mind”  architecture  for  cognition. 
Cognition  in  this  picture  is  a  communicating  collection 
of  modular  agents,  each  of  whom  is  simple,  but  capable
of  some  degree  of  problem  solving.  For  example,  they
can  use  the  means-ends  heuristic  (the  goal-subgoaling
feature  of  deliberation  in  the  Soar  architecture).

Deliberation  has  a  serial  character  to  it.  Almost  all 
proposals  for  the  subdeliberative  architecture,  however, 
use  parallelism  in  one  way  or  another.  Parallelism  can 
bring  a  number  of  advantages.  For  problems  involving 
similar  kinds  of  information  processing  over  somewhat
distributed  data  (like  perception),  parallelism  can  speed 
up  processing.  Some  problems  that  require  explicit 
search  if  done  serially  can  be  done  without  search  in  a 
parallel  architecture.  For  example,  perception  problems 
often  involve  evaluating  a  number  of  alternative 
interpretations  and  choosing  the  best.  These  alternatives
can  be  simultaneously  assessed  in  parallel  and  the
best  picked.  Ultimately,  however,  additional  problem 
solving  in  deliberation  may  be  required  for  some  tasks.
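
As a rough sketch of assessing alternatives simultaneously and picking the best, the following uses a thread pool from the Python standard library. The function name `best_interpretation` and the scoring function supplied by the caller are hypothetical, and a thread pool is only a stand-in for the kind of parallel hardware the text has in mind.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def best_interpretation(alternatives: Sequence[T], score: Callable[[T], float]) -> T:
    """Score every alternative concurrently, then return the highest-scoring one.

    Raises ValueError if `alternatives` is empty.
    """
    with ThreadPoolExecutor() as pool:
        # pool.map preserves input order, so scores[i] belongs to alternatives[i].
        scores = list(pool.map(score, alternatives))
    best_index = max(range(len(alternatives)), key=scores.__getitem__)
    return alternatives[best_index]
```

No alternative is explored before another here: the serial loop over candidates that a search formulation would need is replaced by one evaluation step followed by a selection.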

Within  the  school  that  views  the  subdeliberative 
architecture  as  representation-processing,  there  has 
been  a  debate  about  the  medium  in  which  information 
is  represented.  Turing  computational  architectures  have 
been  the  representational  frameworks  of  choice  for
modeling  deliberation.  For  subdeliberation,  the  same 
framework  was  used  until  connectionism  came  along. 
Connectionism  replaced  the  explicit  processing  of 
symbolic  tokens  with  a  specific  type  of  analog 
computation.  The  original  connectionist  proposals  of  the
PDP  type"  were  in  some  ways  less  powerful  than
Turing  machines.  For  example,  they  had  to  face  the
criticism  that  that  kind  of  computation  cannot  account 
for  the  systematicity  and  generativity  of  natural 
language  which  requires  variable  binding  and  symbols 
of  some  type'*,  requirements  which  the  Turing- 
computational  framework  can  handle  well.  A  number 
of  ways  of  enlarging  the  connectionist  frameworks  to 
give  them  these  capabilities  have  been  proposed.  Some 
involve  using  explicit  symbols  in  connectionist  representations
(see  for  example,  ref.  19),  while  others  involve
representations  that  have  some  of  the  properties  of
symbols  without  being  symbols  in  the  Turing-computational
sense  (see  for  example,  ref.  20).  In  any  case,  most
of  these  connectionist  proposals  are  actually  implemented
and  simulated  in  digital  computers,  and  none  of
the  functions  that  they  compute  are  outside  the  Turing 
framework.  The  problem  does  not  really  seem  to  be 
with  Turing  computation  per  se,  but  rather  the  way  in 
which  Turing  computation  has  been  used  in  AI  and
cognitive  science,  namely  as  applications  of  inference  on 
axiomatically  represented  world  knowledge. 

Connectionism  has  been  evolving  in  a  number  of 
directions.  A  proposal  that  has  been  gaining  currency  is 
that  the  information  processing  of  the  brain  is  a 
dynamical  system-'  defined  by  complex  nonlinear 
differential  equations.  It  has  been  claimed,  for  example, 
that  chaos  may  be  useful  as  a  creative  device  for  new 
states  in  a  search^-,  and  that  dynamic  systems  at 
criticality  have  the  unbounded  dependencies  characteristic
of  context-sensitive  grammars’^.

Edelman  argues  strongly  against  information  processing
theories  of  cognition  on  the  ground  that  they
require  a  prelabeled  world  of  objects  and  relations, 
whereas  biological  organisms  in  his  view  discover 
patterns  as  regularities  in  their  interactions  with  the  world 
rather  than  start  with  prelabeled  information.  He  also 
argues  against  connectionism  since  he  thinks  they 
require  some  form  of  prelabeled  information  as  well. 
His  architectural  proposal  is  not  couched  as  computation
on  representations,  but  as  one  in  which  successful
neural  pathways  are  selected  in  a  process  similar  to 
Darwinian  evolution.  The  selection  is  done  in  response 
to  the  physical  interaction  of  the  organism  with  the 
external  world.  This  process  results  in  neural  structures 
which  categorize  the  organism's  interaction  with  the 
world,  but  these  are  not  fixed  logical  categories,  but 
flexible,  constantly  changing  ones,  to  reflect  the 
organism's  continuing  interaction.  Edelman  has  proposed
additional  mechanisms  by  which  these  structures
develop  higher  and  higher  order  categorizations  and 
coordinations. 

The  motivation  behind  connectionism  and  its 
offshoots  is  generally  couched  as  opposition  to 
symbolic  computation,  and  Edelman  argues  against 
information  processing,  but,  as  we  have  argued  earlier,
the  real  opposition  seems  to  be  to  the  idea  of  a 
representational  repertoire  that  corresponds  to  the 
theories  of  the  external  world  of  objects  and  relations 
that  we  conceptualize  in  our  conscious  models  of  the 
world.  There  is  a  widespread  suspicion  that  AI  and
cognitive  science  have  confused  the  externally  visible 
constructions  of  mind  (explicit  knowledge  of  the  world, 
grammars,  etc.)  as  the  raw  material  of  mind.  In  this 
view,  just  because  we  seem  to  be  using  pieces  of 
knowledge  in  our  deliberation  does  not  mean  that  this 
knowledge  was  represented  in  that  form  in  memory. 
The  phrase  'information  processing'  has  been  too 
closely  associated  with  the  view  that  what  is  inside  the 
mind  is  much  like  what  we  seem  to  have  in  our 
consciousness.  The  opposing  view  is  that  whatever  is 
inside  us  is  not  in  the  form  of  abstract  statements  of 
facts  about  the  world,  but  rather  is  concretely  tied  to 
our  interaction  with  the  physical  world,  flexible,  open-ended,
and  constantly  changing  with  each  interaction.

With  this  proviso  accepted,  we  can  take  a  representational
stance  towards  connectionist  networks  as  well  as
Edelman's  selection  machine.  In  that  sense  of  attributed 
information  or  knowledge  that  we  argued  for  in  our 
discussion  of  the  coin-sorter,  Edelman's  organism  has
knowledge  and  information.  We  can,  from  outside,
watch  an  Edelmanian  brain  at  some  point  in  its
evolution,  and  say  things  like,  ‘This  organism  knows
about  X,  but  not  about  Y.’  In  the  broad  sense  of
information  processing  that  we  have  been  advocating,
Edelman's  organism  is  an  information  processing  agent 
and  its  neural  pathways  represent  knowledge.  If 
knowledge  of  the  world  can  be  in  the  form  of  on-going 
abstractions of experience, which, at the Knowledge
Level, can be interpreted as partial, but increasingly
more  veridical,  knowledge  of  the  world,  then  all  these 
approaches  qualify  as  information  processing  theories. 

Is there a 'right' architectural theory of subdeliberation?
Later in the article we discuss how to place the
various  alternative  proposals  in  useful  relations  to  each 
other. 

So far we have talked about the micro-architecture of
the subdeliberative system. A few brief comments on
macro-architecture are relevant. Fodor has proposed
the Modularity Hypothesis, which asserts that there are
separate modules for each of the perceptual modalities,
the language modality and central cognition. That is,
there  is  relatively  little  interaction  between  them  until 
the  perceptual  and  language  modules  have  completed 
their interpretation tasks. These interpretations are
available in the working memory of deliberation. There
is some debate about how much information flows
from one modality to another during recognition,
but  there  is  general  consensus  that  the  degree  of 
intermodality  information  flow  is  small  in  comparison 
with  the  information  processing  within  each  module. 

Situated  cognition.  Real  cognitive  agents  are  in  contact 
with  the  surrounding  world  containing  physical  objects 
and  other  agents.  A  new  school  has  emerged  calling 
itself  the  situated  cognition  movement  which  argues  that 
traditional  AI  and  cognitive  science  abstract  the 
cognitive  agent  too  much  away  from  the  environment, 
and place excessive emphasis on internal representations.
The traditional internal representation view leads,
according  to  the  situated  cognition  perspective,  to 
excessive  amounts  of  internal  representation  and 
complex  reasoning  using  these  representations.  Real 
agents  simply  use  their  sensory  and  motor  systems  to 
explore  the  world  and  pick  out  the  information  needed, 
and  get  by  with  much  smaller  amounts  of  internal 
representation  processing.  At  the  minimum,  situated 
cognition is a proposal against excessive ‘intellection’.
In this sense, we can simply view this movement as
making  different  proposals  about  what  and  how  much 
needs  to  be  represented  internally.  However,  there  are 
more  radical  versions  of  the  movement  in  which  any 
internal representation is denied. Specifically, the
movement  rejects  the  idea  that  knowledge  is  represented 
in  the  brain  and  retrieved  as  needed,  but  instead  holds 
that  knowledge  is  constructed  by  the  agent  in  a 
complex  interaction  between  neural  processes  and  the 
external situation. ‘[Representations] are the product of
interactions, not a fixed substrate from which behavior
is generated’. The reader will recognize that this view
is  close  to  that  of  Edelman.  This  constructivist  view  of 
knowledge is a major dividing line between the traditional
‘knowledge representation’ view in AI and the situated
cognition  view.  To  take  an  example,  schema  theories  in 
psychology  and  frame  theories  in  AI  have  held  that 
memory  is  organized  in  terms  of  schemas,  stereotyped 
concepts  or  events.  The  newer  view  would  hold  that 
such  schemas  are  actually  constructed  in  response  to 
the  situation,  not  units  of  memory  representation  and 
organization.

In our discussions so far, we have presented two
different views on internal representations. On the one
hand, we have representations in the traditional AI
sense of explicit encoding of facts and so on, and on the
other hand, we also said that one can often take an
external Knowledge Level stance towards the content
of knowledge that is implied by an agent's behavior.
The situated cognition perspective clearly rejects the
former view with respect to internal (subdeliberative)
processes, but accepts the fact that deliberation does contain
and  use  knowledge.  Thus  the  Knowledge  Level 
description  could  be  useful  to  describe  the  content  of 
the agent's deliberation. But the perspective emphasizes the
issues  relevant  to  the  nature  of  the  neural  level 
descriptions  and  the  processes  which  work  with  the 
external  situation  to  construct  the  representations  in 
deliberation. 

The  movement  raises  many  important  issues,  but  the 
solution  to  the  problem  of  what  sort  of  neural  processes 
exist  and  how  the  interactive  process  constructs 
representation  is  still  in  the  future. 

Integrating  the  perspectives 

An  integrated  view  of  problem  solving.  We  briefly 
outline  how  the  major  components  of  the  cognitive 
architecture  work  together  in  the  solution  of  complex 
problems.  The  agent  is  embedded  in  the  physical  world, 
receives  sensory  information,  and  acts  on  the  world. 
Deliberation  is  the  central  co-ordinating  architecture, 
and  its  working  memory  can  contain  both  symbolic 
and  imagistic  data,  constructed  out  of  long  term 
representations  in  response  to  the  goal  at  hand,  as  the 
situated  cognition  movement  proposes.  Memory  can  be 
viewed  at  the  Knowledge  Level  as  containing  this 
information,  but  this  talk  should  not  mislead  one  into 
thinking  that  the  information  that  is  in  working 
memory  was  in  that  form  in  long  term  memory  (see  our 
discussion  on  situated  cognition).  The  agent  also  has 
action  repertoires  which  can  be  thought  of  as  a  form  of 
memory,  but  information  representational  talk  is  much 
less  appropriate  for  describing  them. 

The  degree  of  abstract  problem  solving  required 
depends  on  the  kind  of  goal.  Many  goals  can  be  simply 
solved  by  means  of  one  or  more  of  the  action 
repertoires,  with  little  mediation  from  anything  that 
one  might  call  problem  solving  in  the  sense  of 
manipulation  of  representations  of  choices  in  a  search 
space.  The  goal-action-sensory  system  triple  is  highly 
evolved and integrated to carry out, in a goal-driven
way,  such  action  sequences. 

When  such  action  sequences  are  not  immediately 
available  for  the  goal  there  are  a  number  of  options. 
Working  memory  may  contain  abstract  representations 
of  problem  space  alternatives.  The  problem  space  and 
the  operators  available  may  have  not  only  abstract 
symbolic  components,  but  imagistic  components  as 
well. Working memory may also contain previously
developed sequences of solutions or pointers to external
methods, algorithms, or models. Some of the subgoals
are best accomplished by action sequences, some by
operators that are specific to the image modality (e.g.
reasoning with mental images), some by application of
abstract knowledge operators, and some by invoking
external agents and models. Many of the subgoals can
be  accomplished  just  by  interacting  with  the  world  or 
sensing  the  world  rather  than  by  reasoning  on  complex 
representations.  A  common  way  of  avoiding  complex 
reasoning  is  to  leave  representational  markers  in  the 
physical  world,  and  use  action  and  sensory  operators  to 
‘read off’ the information.
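
The division of labor described above can be pictured with a minimal sketch. It is only an illustration under stated assumptions, not a model proposed in this article: the four subgoal kinds, the handler functions, and the `world` dictionary standing in for the physical environment are all hypothetical names introduced here.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Subgoal:
    name: str
    kind: str  # hypothetical labels: "action", "imagistic", "abstract", "external"


# Each handler stands in for one way of accomplishing a subgoal.
def by_action(goal: Subgoal) -> str:
    return f"{goal.name}: done by a stored action sequence"


def by_imagery(goal: Subgoal) -> str:
    return f"{goal.name}: done by an image-modality operator"


def by_abstract(goal: Subgoal) -> str:
    return f"{goal.name}: done by an abstract knowledge operator"


def by_external(goal: Subgoal) -> str:
    return f"{goal.name}: delegated to an external agent or model"


METHODS: Dict[str, Callable[[Subgoal], str]] = {
    "action": by_action,
    "imagistic": by_imagery,
    "abstract": by_abstract,
    "external": by_external,
}


def solve(subgoals: List[Subgoal]) -> List[str]:
    # Route each subgoal to the kind of method suited to it.
    return [METHODS[g.kind](g) for g in subgoals]


# The world as external memory: leave a marker, then sense it later
# instead of carrying the information in an internal representation.
def leave_marker(world: Dict[str, str], key: str, value: str) -> None:
    world[key] = value


def read_off(world: Dict[str, str], key: str) -> str:
    return world[key]
```

The point of the sketch is only that no single method handles every subgoal, and that some information never needs to enter working memory because it can be read back from the environment when needed.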

The  above  description  emphasizes  how  much  of  real 
problem solving is dominated by the fact that the agent
is  situated  in  the  world  and  how  artificial  a  pure 
symbolic  representation  manipulation  view  can  be  for 
many  problems.  At  the  same  time,  the  above  picture  is 
admittedly  schematic.  A  number  of  important  issues 
remain  unsolved.  We  already  referred  to  the  problem  of 
the  mechanisms  by  which  knowledge  in  working 
memory  is  constructed  in  response  to  goals.  How  the 
sensor-action  system  is  integrated  with  deliberation  in 
an  abstract  sense  requires  many  details  to  be  worked 
out, but it sets a research agenda that is different from
that of traditional AI.

Content-driven AI and microstructural accounts are both
needed. In a strange way, the perspective we just
outlined validates both traditional AI and the new
emphasis on microstructure. Traditional AI, with its
emphasis on knowledge and the distinctions needed to
express it, has tried to wrestle content down. It has
been  able  to  do  this  pretty  well  up  to  a  point,  but 
because it is not embedded in a theory with appropriate
microstructure and environmental interaction, it ends up
over-idealizing  content  and  missing  the  form  in  which 
knowledge  really  emerges.  The  microstructural  accounts 
have  potential  to  explain  the  genesis  and  evolution  of 
knowledge, and, to the extent that they base themselves
on  some  aspects  of  biological  neural  systems,  can 
explain  aspects  of  continuity  in  cognition  between 
higher  animals  and  humans.  It  is  also  often  hoped  that 
the content problem in AI can be solved by AI systems
that learn from scratch or with little initial knowledge.
That is, the hope is that learning will obviate the need
to develop knowledge level distinctions. That seems
highly unlikely for reasons of complexity, both in time
and in the environmental specification, but also due to
the  need  for  specifying  appropriate  initial  states.  It  is 
more  likely  that  the  learning  theories  will  give  broad 
insights  about  content  that  might  place  useful  constraints 
on  knowledge  level  theories.  Thus  the  content-driven 
AI picture and the microstructure-driven new architectural
views need to work side by side for quite a
while,  hoping  to  meet  in  various  ways  and  places  for 
mutual  benefit. 

Hierarchy of leaky architectures and cognitive explanations.
We have mentioned connectionism, dynamical
systems, and Edelman's selection machine as three
contending proposals for the subdeliberative architecture,
and no doubt there will be many others over time.
But to look for a ‘correct’ answer to the cognitive
architecture  may  be  to  commit  an  error  in  reification,  in 
believing  that  there  exists  one  architecture  that  can  be 
factored  off  the  physical  brain  in  such  a  way  that  the 
architecture  corresponds  to  and  only  to  cognition  (or 
more  generally  mentality).  In  the  introductory  section 
on  dimensions  for  thinking  about  thinking,  we 
discussed  the  problems  associated  with  factoring  off  a 
cognitive architecture from a mental architecture. A
similar  issue  arises  in  the  belief  that  a  mental 
architecture  can  be  factored  off  the  physical  brain  or 
the  body,  and  that  a  clearly  defined  set  of  functionalities 
can  be  identified  to  define  mind.  What  we  have  in  the 
brain  is  a  biologically  evolved  complex  piece  of  matter 
working  at  many  levels,  informational,  chemical  and 
electrical.  Certainly  different  stances  can  be  taken 
towards  it  for  different  analytical  purposes,  but 
believing  that  there  exists  a  separable  architecture 
called  the  mental,  especially  one  that  has  a  description 
at  one  level,  may  be  Platonism  run  amok. 

If  this  view  is  right,  then  we  can  see  the  contending 
proposals  for  the  subdeliberative  architecture  as 
approximate  descriptions,  at  somewhat  different  levels, 
of  a  physical  reality  called  brain,  which  in  turn  is  the 
basis  for  a  host  of  behaviors  that  have  a  mentalistic 
description. 

Consider  the  mathematical  description  of  an  economy 
in  a  human  society.  It  would  be  strange  to  regard  the 
economy  as  the  reality  which  just  happens  to  be 
implemented  on  humans.  Description  of  an  economic 
model  is  an  approximate  description  of  certain  types  of 
activities  in  human  society.  This  is  the  analogy  that  we 
would  like  the  reader  to  keep  in  mind  as  we  describe 
our  view  of  hierarchy  of  cognitive  architecture 
descriptions. 

In this view, the Edelman selection machine is a
convenient  and  approximate  description  of  a  machine 
which  is  really  a  complex  chemical  machine.  At  a 
higher  level,  dynamical  systems  provide  another 
approximate description, with connectionist descriptions
providing yet another level of description. We
think  that  when  the  selection  machine  organizes  itself 
to  perform  some  task,  say  perception,  it  should  be 
possible  to  see  in  it  a  description  of  evidences  being 
combined,  the  language  in  which  connectionism  works. 
At  the  top  level  we  have  the  knowledge  level 
description  of  the  agent  in  terms  of  knowledge  and 
goals.  Each  of  these  descriptions  captures  some  aspects 
and  functionalities,  but  misses  others. 

However, this picture of virtual machines all lined up
vertically, the deliberative architecture on top of the
recognition architecture on top of, say, a dynamical
systems architecture which in turn is on top of
something else and so on all the way down to chemistry
and physics, might give a false picture of perfect
implementations of a higher level by a lower level.
Biological  brains  do  not  really  have  cleanly  lined  up 
architectures  in  the  way  that  computers  do.  In  artifacts 
like  computers,  we  as  designers  have  conceptualized  a 
pure  information  processing  machine  and  have  created 
a  complete  one-to-one  correspondence  between  the 
elements  of  that  and  the  elements  of  a  physical 
machine.  Except  when  the  machine  malfunctions  we 
never  have  to  worry  about  the  lower  level  machine.  In 
computer  software  design  each  level  of  architecture, 
each  virtual  machine,  sits  cleanly  upon  the  one  beneath 
it without the one beneath it showing through at all.
Each level is smooth and closed and separate with
respect  to  other  levels  of  the  architecture.  This  sort  of 
architectural  arrangement  has  guided  much  of  our 
thinking  about  human  cognitive  architecture. 

However,  in  a  biologically  evolved  object  like  the 
human  brain  such  a  clean  separation  between  levels  of 
architecture  and  between  software  and  hardware  is 
impossible. This is because, first of all, these architectures
we have been describing are all ‘leaky’ virtual
machines. That is, when the surface structures are
stressed, or under certain situations, the underlying
machine shows through. There are layers of representational
structures, and representations from other layers
peek through at any given layer. As in the case of
vision, where in certain optical illusions the physical
structure of rods and cones shows through the
interpretive architecture, the architecture of the underlying
machines literally shows through in certain
circumstances. The cognitive phenomena are thus not
all going on at one level of architecture. Secondly, these
layers of architectures are not complete, i.e. each level
of description does not fully account for all the
phenomena of interest. Given some complex mental
activity, explanation of some aspects can be given by
the Knowledge Level, for some we will need to appeal
to  the  properties  of  the  connectionist  architecture,  for 
some  to  the  properties  of  the  selection  machine,  and  for 
others  we  may  simply  need  to  appeal  to  chemistry  and 
other  physical  properties. 

What  description  we  use  to  account  for  the 
phenomena  depends  upon  our  goals.  The  cognitive 
phenomena  we  are  looking  at  are  not  going  to  admit  of 
any single level of explanation. They are very multidimensional,
and for some purposes we can account for
the  behavior  by  referring  to  the  deliberative  machine, 
but  for  other  purposes  that  will  not  do,  and  we  will 
have  to  account  for  the  behavior  by  reference  to  a 
lower  level  of  the  architecture.  This  means  that  the 
information processing architectures that we see
underlying human cognitive behavior are architectures
that we have abstracted for certain classes of purposes.
This is not to espouse a form of relativism, however.
Not  everything  counts.  There  are  lots  of  machines  that 
could  not  be  brought  up  as  virtual  machines  by  the 
brain.  Interestingly,  all  the  virtual  machines  that  we 
considered,  from  Soar  to  connectionist  systems  to 
Edelman’s  path  selection  machines,  have  a  special 
feature: they all are oriented towards adaptation and
learning. Thus, there is a relationship between learnability
and being capable of being a virtual machine of
interest. There are facts of the matter to be investigated
and  discovered.  We  can  ask  of  a  proposed  virtual 
machine, what work does it do? How is it useful as a level
of explanation? We can also ask of a particular task,
how  is  it  being  done?  What  sort  of  architecture  is  being 
used  to  accomplish  it?  Although  we  can  potentially 
model each individual function of cognition, there may
be  no  abstract  platonic  engine  which  accounts  for  all 
and  only  cognitive,  or  all  and  only  mental,  behavior. 
There  may  well  be  just  various  cognitive  functions  and 
various  machines  that  can  be  used  to  explain  those 
functions. 


Concluding remarks

We  started  by  asking  how  far  intelligence  or  cognition 
can  be  separated  from  mental  phenomena  in  general. 
We  also  suggested  that  the  problem  of  an  architecture 
for  cognition  is  not  really  well-posed,  since,  depending 
upon  what  aspects  of  the  behavior  of  biological  agents 
are included in the functional specification, there can be
different constraints on the architecture. That is, it is
not clear that, from an architectural perspective, the
idea of a cognitive architecture is a natural kind.
Nevertheless,  we  said,  we  can  talk  about  cognition  as  a 
coherent  phenomenon  of  interest  if  we  think  of  it  as 
that  behavior  in  which  we  ascribe  knowledge  states  to 
the agent. Newell's Knowledge Level view of an agent is
based on a similar point of view about a cognitive
agent.

We  reviewed  a  number  of  issues  and  proposals 
relevant  to  cognitive  architectures.  The  computer 
metaphor has had its day, but, we argued, the
information processing language has significant explanatory
powers left. We ended with the position that the
search for an architectural level that will explain all the
interesting phenomena of cognition was likely to be
futile.  Not  only  are  there  many  levels  each  explaining 
some  aspect  of  cognition  and  mentality,  but  the  levels 
interact even in relatively simple cognitive phenomena.
Ultimately even physics will account for some mental
phenomena. 

By  treating  mentality,  not  to  speak  of  its  cognitive 
component, as ultimately not fully separable from the
physical substrate, we are not being pessimistic about
the prospects for cognitive science and AI, just being
careful  about  what  one  might  expect.  In  one  sense,  this 
view  reinforces  the  arguments  for  the  need  for 
grounding, and being and growing as real humans, as
the  ultimate  requirement  for  achieving  the  kind  of 
mentality that we have. On the other hand, explanations
of all sorts of mental phenomena can come at
various  levels.  We  can  build  problem  solvers,  perceivers, 
cognizers  and  so  on,  and  depending  upon  their  physics 
they  may  have  their  own  version  of  mentality.  There  is 
no need for AI or cognitive science to insist on the
various  Separability  Hypotheses  being  true  in  all  details 
for  getting  nearer  and  nearer  to  the  goals  of 
explanation  and  simulation  of  mind. 

Abstract

A common view of reasoning in cognitive science is that it is a process that operates
on abstract sentential representations. This view implies a separation of reasoning
from sensory perception. Consequently, the study of perception has proceeded
relatively independently of the study of various reasoning strategies that humans
employ. In this paper we argue that there are many commonsense situations in
which human reasoning is tightly coupled with perception, in particular with
perceptually represented experiential knowledge. This type of reasoning is referred to
as perceptual reasoning. We explain perceptual reasoning in terms of experientially
acquired perceptual inference rules, and briefly discuss how this relates to a previous
proposal about representations that underlie visual perception and imagery [5].
Finally, the implications of this stance are discussed.

4.1  Introduction 

Professor  Yoh-Han  Pao’s  research  interests  have  spanned  a  wide  variety  of  issues 
in  Artificial  Intelligence.  The  subject  of  this  paper,  we  think,  reflects  at  least  some  of 
the  range  of  Prof.  Pao’s  interests:  pattern  recognition,  vision  and  problem  solving. 
We  are  pleased  to  discuss  in  this  paper  the  close  integration  between  perception  and 
problem  solving  that  exists  in  people  and  make  some  proposals  about  how  computers 
can  be  designed  which  take  advantage  of  diagrams  and  images  in  problem  solving. 

This paper is a revision of a talk presented at the AAAI Spring Symposium on Reasoning with
Diagrammatic Representations, Stanford University, Palo Alto, California, USA, March 25-27, 1992.
An earlier version appears in the Working Notes of the symposium, pp. 24-29.

Intelligent Systems. Edited by L.S. Sterling
Plenum Press, New York, 1993

Sensation, perception, cognition, and reasoning have all been subjects of study
by cognitive psychologists. How are these related? One answer, from an information
processing perspective, is that reasoning operates on information about the world
around us. This information is made available to the cognitive system through the
processes of sensation, perception, and cognition. Are these four processes strictly
sequential, influencing each other in a unidirectional way? While it seems obvious
that sensation must precede perception, it is not evident that cognition must
strictly follow perception or that reasoning must operate solely on post-cognitive
representations. The work of Biederman [4] on human object recognition, for example,
suggests that perception of a few shapes in an object can trigger recognition
which then can influence perception. Deliberative reasoning has been typically
characterized as operating on post-cognitive representations and therefore divorced from
the acts of perception that produced them. However, the view that human reasoning
and behavior are tightly coupled with (situated in) perception is gaining increasing
credence. An excellent illustration of perceptually grounded reasoning is given
in Shrager's account of commonsense perception [19]. Another confounding
phenomenon in the visual modality has been that of mental imagery [20]. Any theory
of visual perception and cognition that postulates some underlying representations
and mechanisms has to account for this phenomenon.

Humans are adept at making plausible inferences in situations in which the reasoning
employed typically involves perception, cognition, and imagination. Ideas
emerging from recent research (such as those cited above), namely, that perception
and cognition influence each other, that any theory of perceptual representations
and mechanisms should account for imagery also, and that reasoning is sometimes
tightly coupled with perception, are very relevant to analyzing commonsense reasoning
in such situations. In this paper we take a careful look at a specific scenario
that involves reasoning about perceptual and imaginal events. This is called perceptual
reasoning. It is argued that perceptual reasoning can be characterized in
terms of experientially acquired perceptual inference rules. Then we briefly describe
some properties of perceptual representation and architecture. It is suggested
that the specialized (to the visual modality) nature of perceptual architecture and
modality-specific operations on perceptual representations that it provides together
can account for mental imagery and internal visualizations during perceptual reasoning.
What is outlined here are the beginnings of a theory of commonsense visual
reasoning based on the twin ideas of modality-specific representations and inference
rules with perceptual and conceptual content [6]. Finally, the implications of our
stance are discussed.


4.2  Perceptual  Reasoning 

4.2.1  A  Scenario 

Consider  the  following  situation.  You  are  seated  with  others  around  a  table  and 
you  notice  that  someone  sitting  beside  you  is  about  to  throw  a  rock.  You  then 
notice  that  the  rock,  if  thrown,  will  hit  a  glass-paned  window  outside  which  a  child 
is playing. Your immediate reaction will be to restrain the potential rock thrower,
in  order  to  prevent  the  child  from  being  hurt. 

Chapter 4. Perceptual Representation and Reasoning

4.2.2  Analysis 

Let  us  now  analyze  the  reasoning  process  behind  this  prediction  that  the  child 
is likely to get hurt. This inference may be seen to be a direct consequence of another
inference: that the rock will shatter the glass pane resulting in an outwardly
spreading  spray  of  glass  shards  that  will  hit  the  child.  This  inference  is  preceded  by 
yet  another  inference  about  the  possible  trajectory  of  the  rock.  The  first  inference 
in  this  chain,  namely  that  the  flying  rock  will  hit  the  glass  pane  of  the  window,  is 
derived by the visual system using perceptual and motor operations (in this case
scanning, which may involve a mere shift of attention, eye movements, or even turning
the head depending on the distances involved, would provide the needed geometric
information) in response to the internal goal of predicting the possible trajectory
of the rock. The second inference, that the glass will shatter resulting in an outwardly
spreading shower of shards, is more interesting. It is clearly not derivable
from the environment since it has not happened yet. However, we possess experientially
acquired knowledge that glass panes shatter when hit by flying objects. While
this  “chunk”  of  knowledge  is  verbalizable  as  above,  it  consists  of  more  than  what 
this  verbalization  expresses.  This  is  evident  from  the  fact  that  we  are  capable  of 
distinguishing  situations  that  “look”  alike.  For  instance,  we  do  not  always  predict 
shattering  when  seeing  or  thinking  about  a  flying  object  about  to  hit  a  transparent 
pane.  If  we  also  know  that  the  pane  is  made  of  non-breakable  plastic  or  that  the 
object  is  made  of  soft  rubber,  we  will  not  predict  that  the  pane  will  shatter.  Note 
that  the  relevant  knowledge  about  the  material  properties  of  the  pane  and  the  object 
is  conceptual  (non-visual)  in  nature.  Thus  the  knowledge  that  is  brought  to  bear  for 
making  a  prediction  has  a  conceptual  component.  It  has  a  perceptual  component  as 
well,  which  is  what  facilitated  the  internal  visualization  of  the  rock  hitting  the  pane, 
the  pane  shattering,  and  the  shards  flying  outwards  in  the  general  direction  of  the 
rock’s  trajectory.  This  visualization  utilizes  representations  supplied  by  perception 
and imaginal operations on them provided by the perceptual architecture. It is concluded
from this visualization that the shards may hit the child. At this point the
knowledge that a child hit by flying objects may be hurt kicks in (this knowledge
may be purely conceptual, or for very imaginative people, may have a perceptual
component that allows them to visualize glass shards penetrating the skin), generating
the inference that the particular child currently playing outside the window is
likely to be hurt. This, of course, is the motivation behind the restraining act.

This scenario brings out some very interesting aspects of commonsense reasoning.
One is that quantitative information, such as velocity of the rock on impact,
does not seem to have been used by the reasoning process. Therefore this type of
reasoning may be classified as qualitative reasoning. Other interesting aspects become
evident when one considers what kinds of mental or physical operations were
employed in order to make the three inferences and to which entities these operations
were applied. The first inference about the rock hitting the window pane was
made by scanning the scene along a predicted trajectory. Thus, this reasoning involved
perceptual and motor operations applied to the environment (eye movement
for scanning) as well as computations on an internal representation of the environment
(predicting a trajectory). The second inference about the effects of a collision
between the rock and the window pane was made by an internal visualization guided
by our experiential knowledge. Thus, this reasoning involved imaginal operations
applied to an internal representation of the environment. The third inference about
the child being hurt was made by applying conceptual knowledge to information
derived from the visualization.

It is clear that reasoning in this scenario is not merely a process of manipulating
sentential or propositional knowledge. Rather, it is a process of goal-directed
inferences made from the environment and its internal representation perceptually
and imaginally. Generating predictions during such reasoning is mediated by
experiential knowledge that we have about events in the physical world. This is the
phenomenon  that  we  call  perceptual  reasoning. 

4.2.3  Perceptual  Rules 

An  interesting  question  is  how  event  predictions,  which  are  experienced  as  vivid 
internal  visualizations,  are  made  during  perceptual  reasoning.  We  postulate  that 
these  predictions  are  driven  by  perceptual  inference  rules  that  are  acquired  from 
the  experience  of  interacting  with  the  physical  world.  A  perceptual  rule  is  a  piece 
of  inferential  knowledge  whose  antecedent  and  consequent  have  perceptual  compo¬ 
nents.  These  components  are  abstract  representations  of  perceptual  events,  which 
can be matched to a large number of particular situations. In other words, a
perceptual rule is essentially a learned association between two perceptual events
in a sufficiently abstract form that it will match a class of particular perceptual
events. Such perceptual rules may often be verbalized, but such verbalizations
typically lose some of the information contained in the original perceptual version
of the
rules.  For  example,  the  content  of  a  perceptual  rule  relevant  to  the  aforementioned 
glass  shattering  scenario  may  be  verbalized  as  “if  a  flying  object  hits  a  glass  pane, 
the  pane  is  likely  to  shatter,”  but  the  corresponding  internal  representation  has  per¬ 
ceptual  components  that  preserve  spatial  properties  which  can  be  used  directly  for 
prediction.  The  acquisition  of  particular  antecedent-consequent  perceptual  event 
associations  and  their  generalization  into  abstract  perceptual  rules  are  the  result  of 
learning  from  experience. 

A  rule  must  also  have  conceptual  (non-visual)  conditions  that  allow  discrimina¬ 
tion  among  perceptually  similar  situations,  to  some  of  which  the  rule  is  applicable 
while  to  others  it  is  not.  For  example,  one  conceptual  condition  of  the  rule  on  shat¬ 
tering  could  be  that  the  object  be  hard  and  heavy.  While  perceiving  or  visualizing 
a  collision  between  an  object  and  a  transparent  pane,  determining  whether  the  ob¬ 
ject  is  indeed  hard  and  heavy  requires  extraneous  (to  perception)  knowledge.  We 
may  know  that  rocks  are  generally  hard  and  heavy  and  that  the  flying  object  in  the 
current  situation  is  a  rock,  and  thereby  conclude  that  this  conceptual  condition  is 
indeed  satisfied.  Thus,  the  perceptual  and  conceptual  components  together  serve  to 
determine  the  rule’s  applicability  to  any  particular  situation. 

The antecedent of a perceptual rule contains an abstract specification of a
perceptual event that will match with a class of particular perceived or imagined
events. For instance, the abstract specification of the collision event in the
antecedent of the shattering rule would match with a variety of perceived or
imagined collisions between different objects and glass panes of different shapes
and orientations. This raises the question of how a perceptual event is specified
abstractly. With perceptual representations that are hierarchically structured with
levels of increasing detail (such as the object-centered representations of Marr and
Nishihara [12]), this is possible since the antecedent specification can be such
that it will match the top level(s) of perceptual representations of events
belonging to a particular class (e.g., the class of collisions between objects and
panes). This match need not be affected by details of shapes and orientations of
objects and panes involved since these will be represented at lower levels of the
representation hierarchy. The consequent of a rule contains an operational
specification of a predicted event, which the perceptual architecture uses to modify
the particular representation that matched the rule’s antecedent so as to reflect
the predicted event’s occurrence. This modification of the perceptual representation
drives the internal visualization of the predicted event. Thus we are able to
visualize the glass pane shatter, for example, immediately following an imagined
impact of a rock flying through air.

Thus,  perceptual  rules  have  antecedents  consisting  of  abstract  specifications  of 
perceptual events which can match with a variety of particular perceptual events
(actually perceptual representations of events delivered by perception), and
conceptual conditions that provide a finer discrimination capability. Their
consequents contain plausible predictions. Such rules facilitate not only the making
of predictions, but
also  the  visualization  of  these  predictions.  How  are  such  rules  used?  Particular 
perceptual  events,  whether  witnessed  or  imagined,  that  match  the  abstract  specifi¬ 
cations  of  a  rule  will  activate  it.  If  the  corresponding  conceptual  conditions  are  also 
met,  the  rule  is  applied,  resulting  in  the  generation  of  an  inference  (prediction)  and 
the  modification  of  the  perceptual  representation  of  the  triggering  event  to  reflect 
the  effect  of  the  predicted  event.  This  enables  one  to  visualize  the  predicted  event. 
All  this  takes  place  within  the  perceptual  architecture. 
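
The activation and application cycle described above can be sketched in code. The
Python fragment below is only an illustration under stated assumptions, not the
mechanism proposed in this paper: the names PerceivedEvent, PerceptualRule,
is_hard_and_heavy and shatter, the dictionary used as a store of conceptual
knowledge, and the reduction of a hierarchical representation to a top-level "kind"
plus lower-level "details" are all hypothetical simplifications.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class PerceivedEvent:
    """Hypothetical stand-in for a hierarchical perceptual representation."""
    kind: str                                               # top level: coarse event class
    details: Dict[str, str] = field(default_factory=dict)   # lower levels: particulars


@dataclass
class PerceptualRule:
    antecedent_kind: str                                       # abstract event specification
    conditions: List[Callable[[PerceivedEvent, dict], bool]]   # conceptual conditions
    consequent: Callable[[PerceivedEvent], str]                # edits the representation

    def apply(self, event: PerceivedEvent, knowledge: dict) -> Optional[str]:
        # Match only the top level, so shapes and orientations are ignored.
        if event.kind != self.antecedent_kind:
            return None
        # Conceptual conditions discriminate among perceptually similar events.
        if not all(cond(event, knowledge) for cond in self.conditions):
            return None
        # The consequent modifies the representation and yields the prediction.
        return self.consequent(event)


def is_hard_and_heavy(event: PerceivedEvent, knowledge: dict) -> bool:
    props = knowledge.get(event.details["object"], {})
    return props.get("hard", False) and props.get("heavy", False)


def shatter(event: PerceivedEvent) -> str:
    event.details["pane"] = "shattered"
    return "the pane shatters"


shatter_rule = PerceptualRule("object-hits-pane", [is_hard_and_heavy], shatter)
knowledge = {"rock": {"hard": True, "heavy": True}, "balloon": {"hard": False}}

rock_hit = PerceivedEvent("object-hits-pane", {"object": "rock", "pane": "intact"})
balloon_hit = PerceivedEvent("object-hits-pane", {"object": "balloon", "pane": "intact"})
rock_prediction = shatter_rule.apply(rock_hit, knowledge)
balloon_prediction = shatter_rule.apply(balloon_hit, knowledge)
```

In this sketch both events match the abstract antecedent, but only the rock
satisfies the conceptual condition, so only its representation is modified to show a
shattered pane.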

This conception of perceptual reasoning is somewhat different from the traditional
view of reasoning and deliberation. It was partly motivated by viewing perception
and imagery from the artificial intelligence perspective of reasoning. The aim of
all this theorizing has been to put forth one explanation of a kind of reasoning
that humans engage in routinely. This sort of reasoning has not hitherto received
much attention in artificial intelligence or cognitive psychology.

4.2.4  Perceptual  Representations 

In the foregoing description we referred more than once to hierarchically
structured perceptual representations and a perceptual architecture. Therefore, a
discussion of what we mean by these terms is in order. The representation of visual
information and its relation to the phenomenon of mental imagery have attracted a
great deal of attention from cognitive psychologists. There has been considerable
debate [1, 10, 15, 20] about postulating analog representations for imagery as
opposed to uniform propositional representations. While all contentious issues
regarding imagery may not yet be fully resolved, some properties of perceptual
representations can be gleaned from the empirical and philosophical literature on
this topic. Four such properties are the following.

(1) Perceptual representations must be compositional, since we can compose
different (even previously unseen) mental images quite easily and rapidly.
Biederman’s [4] theory of recognition by components suggests that such
representations must be componential, and hence compositional.

(2) Perceptual representations must contain information about the represented
object at different levels of detail since we are able to imagine a previously seen
object at different resolutions. This suggests that the representations may be
hierarchically structured with levels of increasing detail, along the lines
suggested by Marr and Nishihara [12].

(3) Perceptual representations must support an “internal depiction” of the
represented object since a mental image is an experience of such a depiction.
Indeed, researchers have postulated mechanisms such as the visual buffer [10] and
symbol-filled arrays [20] to account for this depictive property.

(4) Perceptual representations must also have a descriptive component encoding
structural  information,  as  Hinton  [7]  points  out. 

We have previously argued [5] that hierarchically structured descriptive
representations, when loaded from long term memory into a specialized (to the visual
modality) architecture that provides modality-specific operations on these
representations, can give rise to the experience of mental imagery. Furthermore, we
believe that these perceptual representations are similar, in a sense, to the
discrete symbolic representations used by computers. A discrete symbolic
representation consists of structures of symbols composed according to well-defined
rules of formation. A symbol is just a token, and by itself devoid of any meaning.
Therefore the distinctive property of such a representation is that its semantics
derives from its interpretation, by means of operators provided on it, by the
architecture of the system in which it resides. We suggest that perceptual
representations are similar to discrete symbolic representations in that they also
have this property. In other words, the experience of
mental  imagery  and  of  applying  operations  like  scanning  on  mental  images  arise  not 
from  any  inherent  analog  property  of  the  representations  themselves,  but  from  an 
interpretation  of  perceptual  representations  by  a  modality-specific  architecture. 

This  characterization  of  perceptual  representations  is  different  from  an  analog  or 
propositional  characterization.  On  one  hand,  unlike  analog  representations,  these 
are not depictive, but are descriptive. On the other hand, unlike propositional
representations, their interpretation is not based on some universal truth
semantics, but is dependent on the architecture in which they reside. Since in this
view the meaning of a perceptual representation derives from its interpretation by
the underlying architecture and processes that operate upon it, properties exhibited
by such a representation are not intrinsic to the representation itself, but stem
from the modality-specific architecture that supports it and processes that operate
on it. This feature provides a way to explain how non-analog representations can, in
principle, give rise to the experience of imagery by virtue of being interpreted by
an underlying modality-specific architecture.

Perceptual representations are hierarchical structures comprising descriptors of
visual attributes such as shape, color, texture, etc., and of spatial relations
among elements. The hierarchical structure reflects different levels of detail at
which a perceived scene is encoded. The depth of the hierarchy and resolution of
description at different levels depend upon aspects (such as objects in the scene
that were attended to, how closely they were looked at, etc.) of the act of
perception which
produced  the  perceptual  representation. 

A  collection  of  descriptors  in  a  perceptual  representation  that  together  corre¬ 
spond  to  a  distinct  element  of  a  scene  (such  as  an  object  or  part  of  an  object)  forms 
a  percept.  A  perceptual  representation  is  made  up  of  multiple  percepts.  A  percept 
is  a  basic  unit  or  building  block  of  perceptual  representations.  It  describes  all  visu¬ 
ally  perceived  information  about  a  distinct  element  of  an  image.  Each  percept  has 
a  corresponding  mental  image  that  results  from  its  interpretation.  This  image  in 
the  mind’s  eye  is  the  depictive  counterpart  of  the  percept,  which  is  descriptive  in 
nature.  The  description  and  the  depiction  are  two  sides  of  the  same  coin.  Thus  a 
perceptual  representation  consisting  of  percepts  is  both  an  internal  description  of  a 
perceived  scene  and  a  recipe  for  the  composition  of  a  corresponding  mental  image 
through  interpretation.  Percepts  provide  compositionality,  i.e.,  allow  one  to  com¬ 
pose  perceptual  representations  (and  corresponding  mental  images)  from  percepts 
that  are  parts  of  representations  of  different  images. 
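
The hierarchical and compositional character of percepts can be illustrated with a
small data structure. This is a hedged sketch only: the Percept class, its fields,
and the describe and compose helpers are names invented for this illustration, and a
nested list of parts is just one assumed way to model levels of increasing detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Percept:
    """Descriptors for one distinct scene element, with finer-grained parts."""
    name: str
    descriptors: Dict[str, str] = field(default_factory=dict)
    parts: List["Percept"] = field(default_factory=list)


def describe(percept: Percept, max_depth: int) -> List[str]:
    """List the elements visible when the hierarchy is read down max_depth levels."""
    names = [percept.name]
    if max_depth > 0:
        for part in percept.parts:
            names.extend(describe(part, max_depth - 1))
    return names


def compose(name: str, *percepts: Percept) -> Percept:
    """Build a new representation from percepts taken from different scenes."""
    return Percept(name, parts=list(percepts))


car = Percept("car", {"color": "red"},
              [Percept("wheel", {"shape": "circle"}),
               Percept("door", {"shape": "rectangle"})])
horse = Percept("horse", {"color": "brown"},
                [Percept("leg", {"shape": "cylinder"})])

# A previously unseen image assembled from parts of two remembered scenes.
imagined = compose("horse-on-car", horse, car)
coarse = describe(imagined, 1)
fine = describe(imagined, 2)
```

Reading the same structure to different depths corresponds to imagining the scene
at different resolutions, while compose reuses existing percepts to form a new
representation.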

A  mental  image  results  from  accessing  relevant  percepts  in  memory,  bringing 
these  into  the  perceptual  architecture,  and  composing  these  appropriately.  This 
visual-modality  specific  architecture  provides  imaginal  operations  on  the  represen¬ 
tation.  Just  as  interpretation  generates  an  experience  of  mental  imagery,  the  invoca¬ 
tion  of  an  imaginal  operation  (e.g.,  scanning)  concomitantly  creates  the  correspond¬ 
ing  experience  (e.g.,  that  of  scanning  an  image  with  the  mind’s  eye).  The  imaginal 
operations  that  the  perceptual  architecture  provides  on  perceptual  representations 
are  similar  to  the  operations  that  the  visual  system  employs  under  perceptual  con¬ 
ditions,  but  they  may  not  be  identical  [17]. 

Note  that  though  the  analogy  of  discrete  symbolic  representations  interpreted 
by  an  underlying  computational  architecture  was  used  to  explain  perceptual  repre¬ 
sentations,  these  need  not  necessarily  be  symbolic  in  nature.  These  may  instead  be 
realized  as  patterns  of  weights  or  strengths  in  a  neural  substrate.  The  preservation 
of  aforementioned  properties  is  what  is  important.  Our  proposal  is  also  neutral  with 
respect  to  the  postulation  of  an  analog  medium  for  mental  images  and  internal  visu¬ 
alizations.  It  may  very  well  be  that  interpretation  of  the  perceptual  representation 
of  a  scene  results  in  the  creation  of  a  surface  display  in  a  visual  buffer  as  Kosslyn 
[10]  suggests  or  the  creation  of  a  symbol-filled  array  as  Tye  [20]  suggests.  On  the 
other hand it may be the case that this interpretation merely produces the same
pattern of neural activation as that created by the topographic projection of the
retinal image into the visual cortex during perception, thus creating the experience
of mental imagery. The main point here is that representations that are neither
analog nor propositional and an underlying perceptual architecture that provides
operations specific to the visual modality can possibly explain the phenomenon of
imagery as well as support perceptual reasoning.

4.3  Discussion 

In this paper a phenomenon termed perceptual reasoning was illustrated with an
example and explained in terms of perceptual inference rules. Subsequently, we
described properties of perceptual representations and architecture that may
underlie visual perception and imagery as well as support perceptual reasoning. For
the moment at least, the strength of ideas in this paper lies in the richness of
their implications rather than in the extent of their empirical support.

One point of this paper, therefore, is to suggest that it is worth conducting
experiments to gather (supporting or opposing) direct evidence. There is some
indirect evidence available. Yates and colleagues [21] report an experiment which
showed that individuals solving motion prediction problems, such as predicting the
path of a ball released from the end of a rotating sling, subjectively experienced
reported solutions as the result of a mental enactment of the problem situation.
Furthermore, they suggest that the source of this enactment appeared to be a number
of relatively unsystematic, mutually inconsistent, and situation-specific prototypes
based on experience, which capture typical aspects of motion. A precise
characterization of perceptual inference rules in terms of content and organization,
if indeed these are psychologically real, requires much empirical research of the
sort undertaken by Yates and colleagues. The phenomenon of “perceptual fluency”
observed by Jacoby and Dallas [8] is also relevant. They found that following a
24-hour delay after subjects studied a word-list, their conscious recognition of
these words was at near-chance levels. But the subjects were twice as likely to
recognize these words, compared to control words, in a tachistoscopic-recognition
paradigm. It is argued that prior exposure leads to “perceptual fluency”. In a
similar vein, Proffitt and
Keiser  [14]  found  that  subjects  were  very  good  at  rejecting  anomalous  motions  when 
shown  videotapes  of  actual  and  simulated  motions.  If  prior  exposure  can  lead  to 
perceptual  fluency  in  recognition  and  classification,  is  it  possible  that  it  can  also  lead 
to  perceptual  fluency  in  reasoning?  Koedinger  and  Anderson  [9]  show  that  expert 
problem  solving  behavior  in  geometry  can  be  modeled  in  terms  of  the  instantiation 
of  schemas  which  organize  problem  solving  knowledge  around  prototypical  images. 
The  idea  of  perceptual  inference  rules  in  which  perceptual  events  cue  predictions 
is  not  too  far  removed  from  this.  Anderson  [2]  has  pointed  out  that  production 
systems  consisting  of  rules  with  antecedents  and  consequents  are  not  incompatible 
with  imagery  processes. 

What  about  the  implications  of  this  proposal?  One  implication  is  that  the 
true  nature  of  representations  underlying  perception  and  imagery  may  be  closer 
to  a  middle  ground  between  the  extreme  positions  within  the  propositional  and 
analog schools of thought. More relevant to the focus of this paper, however, is
the notion that reasoning may be situated or grounded in perception. Another
implication, therefore, is to suggest that much of commonsense reasoning is tightly
coupled with perception and that perceptual representations, not just conceptual
knowledge, play a crucial part in the reasoning process. Just as the paradigm of
case-based reasoning [18] provided an impetus for computational investigations of
memory-based reasoning, our hope is that proposals like this will provide an impetus
for computational studies of perception-based reasoning. In fact, there is already a
move toward initiating the formal study of the use of diagrams and pictures in
reasoning [3]. Furthermore, studies of human reasoning also ought to pay closer
attention to reasoning processes that are closely coupled with perception and
perceptual representations. A broader implication is that reasoning may be coupled
with perception in modalities other than vision as well. We support a previously
expressed view [11, 17] that research on reasoning in artificial intelligence should
explore  connections  to  vision  and  imagery. 

Our  current  research  is  on  developing  a  computational  model  of  perceptual  rep¬ 
resentation  and  reasoning  in  the  domain  of  spatially  interacting  objects  depicted 
in  diagrams.  Limited  space  precludes  a  detailed  discussion  here  (see  [13]  for  an 
overview).  However,  two  aspects  of  this  computer  system  that  directly  relate  to 
the  focus  of  this  paper  and  therefore  worthy  of  mention  are  the  computational 
realizations  of  perceptual  representation  and  inference  rules.  The  computer  rep¬ 
resentation  of  an  object  configuration  diagram  consists  of  a  hierarchy  of  symbolic 
descriptors  and  bitmap-based  depictions,  both  of  which  are  accessible  to  reasoning 
procedures.  This  representation  is  compositional,  hierarchical,  partly  depictive,  and 
partly  descriptive.  Representational  devices  called  visual  cases  are  used  to  encode 
the predictive knowledge of perceptual inference rules. A visual case consists of
visual cues, conceptual conditions, and predicted events. We have developed a suite
of over sixty cases for the current domain (see fig. 4.1 for an example). Given a
prediction  problem  that  consists  of  a  multi-object  configuration  diagram  and  the 
initial  motion  of  one  object,  the  system  will  go  through  cycles  of  visual  analysis 
and  deliberation,  and  predict  the  temporal  evolution  of  the  configuration.  Visual 
cases  come  into  play  during  deliberation.  Visual  cues  are  used  for  matching  cases  to 
the  current  configuration,  conceptual  conditions  provide  additional  discrimination 
capability, and predictions of cases found to be applicable drive the next cycle of
reasoning.  It  is  worth  noting  here  that  visual  cues  make  visual  case  matching  similar 
to  the  instantiation  of  diagram  configurations  [9],  and  conceptual  conditions  provide 
a  means  for  the  computational  modeling  of  cognitive  penetrability  [16]. 

Abstract

This paper is concerned with how diagrams can
be used for reasoning about spatial interactions
of objects. We describe a computational
approach that emulates the human capability of
predicting interactions of simple objects depicted
in two dimensional diagrams. Three core aspects
of this approach are a visual representation
scheme that has symbolic and imaginal parts, the
use of visual processes to manipulate the
imaginal part and to extract spatial information,
and visual cases that encode experiential
knowledge and play a central role in the
generation of spatial inferences. These aspects
are described and the approach is illustrated with
an example. Then we show that reasoning with
images is an emerging and promising area of
investigation by discussing computational and
cognitive research on imagery.

1  Introduction 

Humans quite often make use of spatial information
implicit in diagrams to make inferences. For example,
anyone familiar with the operation of gears will be able
to solve the problem posed in Figure 1 by imagining
the rotary motion of gear1 being transmitted to the rod
through gear2, resulting in the horizontal translation of
the rod until it hits the wall. In such situations humans
reason about spatial interactions not only by using
conceptual knowledge, but also by extracting
constraints on such interactions from a perceived
image. This integrated use of visual knowledge (about
spatial configurations) from the diagram and
conceptual knowledge (such as the rigidity or plasticity
of objects involved) is a very interesting
phenomenon. In this paper we illustrate a
computational approach that emulates this capability
for solving simple motion prediction problems.


2.1  Motion  Prediction  Problems

The class of problems we address is the following:
given a two dimensional diagram of the spatial
configuration of a set of objects, one or more initial
motions of objects and relevant conceptual
information about them, predict the subsequent
dynamics of the configuration. Figure 2 shows a
typical example.

2.2  Cognitive  Inspiration

There is considerable evidence in cognitive
science for the use of mental images by people when
solving problems [Kosslyn, 1981]. Furthermore,
introspective reports of people when given a diagram
like Figures 1 or 2 and asked to predict motions
indicated that by looking at the diagram they were able
to visualize the motion of one object causing that of
another through physical contact. They appeared to
be using (the image of) the diagram in front of them
directly to simulate motions in their minds. These
reports indicated the following.

(1) Given a diagram depicting the problem, humans
quite rapidly focus on localities of potential
interactions.

(2) People also seem to simulate or project the motion
to determine the nature of interactions that will occur.

(3) For reasoning about the dynamics (e.g., how will
motion be transmitted after a collision?) humans bring
conceptual knowledge (e.g., gears are rigid objects)
and experiential knowledge (e.g., if an object collides
with another, it typically transmits motion in the same
direction) to bear on the problem.

We have developed an approach that emulates these
capabilities.

2.3  Representation 

The  specification  of  a  motion  prediction  problem 
consists  of  a  scene  depicting  the  spatial  configuration 
of  the  objects  involved  and  conceptual  information 
about  their  properties  (see  Figure  2).  The  spatial 
configuration is represented using a ‘visual
representation’ while conceptual information about
object properties is represented declaratively and
linked with corresponding object descriptions in the
visual  representation.  In  our  computer  model  the 
visual  representation  of  a  problem  specification  is 
interactively  constructed  prior  to  problem  solving, 
whereas  in  the  case  of  humans  perceptual  processes 
deliver  such  representations. 

Mental representation of visual information and its
relation to the phenomenon of mental imagery have
been the focus of considerable research in cognitive
science [Biederman, 1990; Finke, 1989; Kosslyn,
1981; Pylyshyn, 1981]. A central issue here is the
question of how mental imagery that appears to be
analogic in nature can arise from underlying
representations that are considered to be
propositional. One hypothesis regarding this issue
[Chandrasekaran and Narayanan, 19??] is that
representations for different sensory modalities are
operated upon by interpreters that provide privileged
operations specific to that modality, thereby linking
the symbols in the representation to perceptual
primitives in the corresponding sensory domain. Thus
our answer to the above question is that symbolic
representations of visual information are interpreted
by mechanisms that are specialized to the visual
modality and which provide operations specific to that
modality. These operations construct mental images
using perceptual primitives in the visual domain. Visual
representations in our computer model are therefore
structured as multi-level hierarchies that contain
imaginal descriptions and symbolic descriptions of the
object configuration. Each level of the hierarchy
contains a symbolic description and an imaginal
description of the configuration at a certain resolution.
The symbolic description is built from parametrized
shape primitives like circles, rectangles, etc. whereas
the imaginal description is a two dimensional pixel
array of fixed width and height in which a configuration
is depicted by object boundaries and is implemented
as a bitmap. In the rest of the paper we will use the
term ‘diagram’ to refer to this boundary-based
rendering of the object configuration. Thus the visual
representation is dual (symbolic and imaginal) in
nature. The two types of mental representations
(surface images and deep encodings) that Kosslyn
[1981] proposes reflect a similar duality.
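
As a rough illustration of this dual representation, the following Python sketch
pairs a list of parametrized shape primitives with a bitmap that depicts their
boundaries. It is not the implementation used in our computer model: the Circle and
Level classes, the render_boundaries and make_level functions, and the choice of
sampling the outline at 8r points are all assumptions made for this example, and
only a single resolution level with circles is shown.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Circle:
    """A parametrized shape primitive (symbolic description)."""
    cx: int
    cy: int
    r: int


def render_boundaries(shapes: List[Circle], width: int, height: int) -> List[List[int]]:
    """Draw each shape's boundary into a fixed-size bitmap (imaginal description)."""
    bitmap = [[0] * width for _ in range(height)]
    for s in shapes:
        steps = max(8, 8 * s.r)   # more samples than the circumference in pixels
        for k in range(steps):
            t = 2 * math.pi * k / steps
            x = round(s.cx + s.r * math.cos(t))
            y = round(s.cy + s.r * math.sin(t))
            if 0 <= x < width and 0 <= y < height:
                bitmap[y][x] = 1
    return bitmap


@dataclass
class Level:
    """One level of the hierarchy: both descriptions of the same configuration."""
    symbolic: List[Circle]
    imaginal: List[List[int]]


def make_level(shapes: List[Circle], width: int, height: int) -> Level:
    return Level(symbolic=shapes, imaginal=render_boundaries(shapes, width, height))


level = make_level([Circle(5, 5, 3)], width=11, height=11)
```

Reasoning procedures can then consult level.symbolic for abstract properties such
as a radius, and level.imaginal for specific spatial facts such as which pixels lie
on a boundary.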

The most interesting property of this
representation is that it simultaneously provides
abstract symbolic descriptions of an object
configuration and directly captures, in the imaginal
descriptions, specific spatial information about the
object configuration (the extent of contact between
two surfaces, for example). The justification for our
decision to structure the symbolic descriptions in
terms of parametrized shape primitives stems from
shape representation theories that utilize primitives
like geons [Biederman, 1990] and generalized cones
[Marr and Nishihara, 1976]. Multiple levels of
description are provided in the representation to allow
visual processes to operate at different levels of
resolution.

2.4  Reasoning 

The basic model of reasoning is as follows. The
system goes through a sequence of deliberative
states. This sequence corresponds to the changes
that the object configuration undergoes due to
motion and interaction of objects. Each deliberative
state represents a particular configuration that the
objects attain at some point during the evolution of
behavior. What distinguishes a deliberative state from
other states is that it represents a configuration in
which an interaction (such as collision) has occurred
that will change the subsequent behavior of objects.
The term deliberative refers to the necessity of
‘deliberation’ that arises at these states in order to
predict future behavior of objects. A significant
characteristic of this deliberation is the combined use
of perceptual information from the diagram and
knowledge relevant to the situation.

The transition between deliberative states is
accomplished by two groups of processes. In one,
purely visual operations such as attention-focussing,
scanning, boundary-tracking, and contact-detection
are used to identify significant aspects of the current
object configuration from the diagram (e.g., locating
interesting regions, identifying surfaces of potential
interaction, etc.), to reason about how the
configuration will evolve (e.g., project surface
motions), and to detect the next deliberative state. A
deliberative state is detected by watching out for
certain events as the configuration depicted in the
diagram changes. The establishment of a new contact
between objects, the elimination of a previously
existing contact, the establishment of a new support
relationship between objects, and the removal of a
support relationship are some examples of events
which indicate a deliberative state. Motivations behind
and justifications for these processes derive from the
extensive literature on mental imagery, some of which
is discussed in section 3, and the work of Chapman
[1990] and Ullman [198?] on visual routines. This
group of processes corresponds to the human
cognitive process of imagining.

The second group of processes accomplishes the
aforementioned task of deliberation. Here knowledge
about how interacting physical objects tend to behave
under various conditions is used to predict the
behavior of objects following the current deliberative
state. We take a specific position on the form in which
this knowledge is available and the way in which it is
utilized. This is described next. A process model of
visual reasoning is shown in Figure 3.

We believe that the knowledge humans bring to
bear on making spatial inferences in similar situations
is mostly acquired through experience, and so in the
computer model experiential knowledge has been
given a central role in deciding how to proceed from a
deliberative state. Experiential memory is considered
to be an organized and indexed collection of cases
[Schank, 1982] and case-based reasoning is a
computational paradigm for modelling the role of
experience in problem solving [Kolodner and
Simpson, 1989]. Therefore, representational
structures called 'visual cases' have been developed
to encode knowledge applicable at deliberative states
and to facilitate inferencing. Each case represents a
typical spatial event. Since cases represent
experiential knowledge, they may not be logically
parsimonious or mutually exclusive. A visual case has
three parts. One is information about spatial
configurations to which the case is applicable. Cases
are called 'visual' because this information is visual in
nature and is the 'key' by which relevant cases get
selected during reasoning. It may also be viewed as
an 'abstract' image that depicts the essential aspects
of configurations to which the case is applicable.
Because of this abstractness a case can be matched
with a variety of specific configurations. This property
obviates the need for a large number of cases. The
second part is non-visual information that qualifies the
visual part further, and it is used for deciding the
applicability of a case to a particular situation. The third
part is a predicted event affecting objects in the spatial
configuration represented by the case. This event
may specify a state change (e.g., a directional force
being applied on an object), a continuous change
(e.g., an object moving in a particular direction), etc.
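
The three-part structure of a visual case can be pictured as a small record. The following Python sketch is only an illustration and not the implementation described in the paper: the class name `VisualCase`, the encoding of the visual key as a set of symbolic cue labels, and the role names `mover` and `target` are all assumptions introduced here.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass
class VisualCase:
    """Hypothetical record for one chunk of experiential spatial knowledge."""
    name: str
    # Part 1: abstract visual key (assumed here to be a set of cue labels).
    visual_key: FrozenSet[str]
    # Part 2: non-visual conditions, as role -> required property values.
    conditions: Dict[str, Dict[str, object]]
    # Part 3: the predicted event, kept as a plain description.
    predicted_event: str


def key_matches(case: VisualCase, cues: set) -> bool:
    """The abstract key matches any configuration showing all of its cues."""
    return case.visual_key <= cues


def conditions_hold(case: VisualCase, properties: Dict[str, Dict[str, object]]) -> bool:
    """Check every required property value against the problem specification."""
    return all(
        properties.get(role, {}).get(prop) == wanted
        for role, required in case.conditions.items()
        for prop, wanted in required.items()
    )


# Illustrative case: a rigid mover colliding with a rigid, unfixed target.
slide_case = VisualCase(
    name="collision-causes-sliding",
    visual_key=frozenset({"moving-object", "stationary-object", "collision"}),
    conditions={"mover": {"rigid": True}, "target": {"rigid": True, "fixed": False}},
    predicted_event="target slides in the direction of the mover",
)
```

Because the key is a subset test, the same case matches many concrete configurations, which mirrors the point that abstract keys reduce the number of cases needed.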

Humans are skilled at blending perceptual and
conceptual information in generating spatial
inferences. To see this, consider the
prediction about the motion of object2 after object1
collides with it, given the problem specification of
Figure 2, and then notice how this prediction will
change if the specifications were changed to indicate
that object1 is non-rigid (say, made of rubber) and that
object2 is fixed on surfaces. The visual and non-visual
parts of a case explicitly capture this aspect. Thus the
intent of visual cases is to represent simple chunks of
experiential knowledge about spatial events that
humans typically have, and to model the blending of
conceptual and perceptual information in making
spatial inferences. An example of typical knowledge
about spatial events is 'a rigid object resting on a rigid

easwwrsew  ea  v^ea^^aw^e  v^^^aeMvse  vavi 

SeJ^^HCVf  OQiHOQO  Oy  S 

tend to slide in the same direction'. The schematic of
a corresponding visual case is shown in Figure 4.

Cases are retrieved and applied to predict events as follows.
The retrieval of cases relevant to the spatial
configuration in the diagram is based on visual cues.
From among the retrieved cases, applicable ones are
selected by using information about object properties
(which is available as part of the problem specification) to
verify the non-visual parts of the cases. Events
predicted by the applicable cases are further pruned
by verifying, through visual processes, their feasibility
in the current object configuration. The remaining
events serve to guide subsequent steps of
reasoning. Since a case brings conceptual knowledge
to bear on visual reasoning, this mechanism of
inference may be viewed as a computational
realization of cognitive penetrability or the influence of
tacit knowledge on mental imagery [Pylyshyn, 1981].
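
The retrieve, select, and prune steps described above can be sketched as a three-stage filter. This is a minimal Python illustration under stated assumptions: cases are plain dictionaries with hypothetical fields `key`, `conditions`, and `event`, and the visual feasibility check is passed in as a function because the paper's visual processes are not reproduced here.

```python
from typing import Callable, Dict, List


def predict_events(
    cases: List[Dict[str, object]],
    cues: set,
    properties: Dict[str, object],
    is_feasible: Callable[[str], bool],
) -> List[str]:
    """Retrieve by visual cues, select by non-visual conditions, prune by feasibility."""
    # Step 1: retrieval, driven purely by visual cues found in the diagram.
    retrieved = [c for c in cases if c["key"] <= cues]
    # Step 2: selection, verifying non-visual parts against object properties.
    applicable = [
        c for c in retrieved
        if all(properties.get(k) == v for k, v in c["conditions"].items())
    ]
    # Step 3: pruning, keeping only events feasible in the current configuration.
    return [c["event"] for c in applicable if is_feasible(c["event"])]


# Illustrative case base (contents are invented for the example).
cases = [
    {"key": {"collision"}, "conditions": {"target_fixed": False}, "event": "target slides"},
    {"key": {"collision"}, "conditions": {"target_fixed": True}, "event": "mover stops"},
    {"key": {"support-removed"}, "conditions": {}, "event": "object falls"},
]
events = predict_events(
    cases, {"collision", "moving-object"}, {"target_fixed": False}, lambda e: True
)
```

Keeping the feasibility test as a separate stage reflects the ordering in the text: conceptual knowledge narrows the candidates first, and visual inspection of the diagram has the final say.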

2.5  An  Example 

In this section we present a problem solving episode
in some detail. The specification of the problem, which
includes a depiction of an initial configuration of
objects, an initial motion and relevant non-visual
properties of the objects, is shown in Figure 2. The
goal is to predict all resulting motions by reasoning
about spatial interactions that will occur among the
objects. In our computer model control of reasoning is
done by a procedure that generates goals and
subgoals, and activates relevant processes to
achieve them. Thus an execution trace will appear as
a tree consisting of goals, subgoals, and processes.
The goal generation follows the process model in
Figure 3.

Figure 5 shows a partial execution trace for this
example. 'Reason about spatial interactions' is the
top level goal and it has four subgoals as shown.
Consider the first subgoal 'locate interesting spatial
regions'. There is a set of heuristic criteria to locate
interesting regions, one of which is that regions
representing touching surfaces of multiple objects
are interesting. The visual process corresponding to
this subgoal focuses on each object in turn, tracks its
boundary, and looks for regions that satisfy the
criteria. In this example it finds the bottom surface of
object2 as shown in Figure 6.a. Next, surfaces that
have the potential for interaction are located (Figure
6.b shows the surfaces identified for the current
problem) and another process projects the motion of
moving surfaces that are identified to have interaction
potential while watching for the occurrence of
deliberative states. The first deliberative state
detected is the configuration in which contact occurs
between objects 1 and 2, and the diagram is modified
to depict this configuration, as shown in Figure 6.c.
The next subgoal is to predict subsequent dynamics
of this configuration and this is accomplished by the
application of visual cases. Three visual cues (the
presence of a rotating object and a stationary object
and the occurrence of a collision between the two) are
used to retrieve cases, and visual keys of cases are
matched with the current configuration by inspecting
its visual representation. The availability of symbolic
descriptions as well as diagrams in the visual
representation allows the matching of visual keys to
proceed at an abstract level without recourse to
techniques like template matching. A visual case
similar to the one shown in Figure 4 (except that the
moving object is undergoing rotation) is found to be
relevant and applicable, and the event that it predicts
is found to be feasible in the current configuration.
Non-visual conditions associated with this case are
similar to those in Figure 4 and are easily verified from
the problem specification. The resulting prediction is
shown in Figure 6.d. As the process model shows,
after this step the entire cycle is repeated and in the
next detected deliberative state object2 has collided
with surface4. This time a case that predicts cessation
of motion gets applied and Figure 6.e shows the final
configuration.

3  Related  Work

In this section we present computational and cognitive
research which touches upon imagery, in support of
our contention that imaginal reasoning is an emerging
research area that is highly promising. Cognitive
scientists have demonstrated not only the powerful
role of imagery in human problem solving but also the
advantages of incorporating similar reasoning
capabilities in computer programs. For example, Larkin
and Simon [1987] persuasively argue for the
computational advantages afforded by diagrammatic
representations and perceptual inferences that such
representations support, for solving physics and
geometry problems. Koedinger and Anderson [1990]
describe a model of geometric proof problem solving
in which parsing of a diagram of the problem to detect
specific diagram configurations is a key step. These
configurations then cue relevant schematic
knowledge for proceeding with the proof. Visual
cases represent a generalization of this idea.
Logicians have also noted the power of visual
representations. Barwise and Etchemendy [1990]
illustrate the role of visual representations in
mathematical reasoning through a program called
Hyperproof which allows the user to reason using
sentential and pictorial forms of information.

Despite intuitively compelling evidence for the
use of imagery by humans, there has not been much
work in artificial intelligence toward endowing
machines with a similar capability. An early program
that utilized diagrams was WHISPER [Funt, 1977]
which addressed rotation, sliding, and stability of
blocks-world structures. More recently, work on using
pictorial or 'analogical' representations for simulating
the behavior of strings and liquids in space has been
reported [Gardin and Meltzer, 1989]. Shrager [1990]
describes a computational model of understanding
laser operation in which reinterpretation processes
utilize event depictions in an iconic memory as well as
in a propositional memory. The research on
computational modeling of the cognitive process of
spatial reasoning with diagrams [Narayanan, 1991] is
yet another step towards realizing the full potential of
imaginal reasoning by computers.

4  Concluding  Remarks

We have described a novel approach to reasoning
about spatial interactions. Since our aim in this paper
has been to provide the reader with an overview of all
significant aspects of visual reasoning within the
limited space available, the descriptions have been
necessarily schematic in character. Further details on
components of this approach (structure of visual
representations, how visual processes are composed
from basic visual operations, indexing and adaptation
of visual cases, the computer program that
implements this model, etc.) can be obtained from
[Narayanan, 1991].

The advantages of using diagrams in this
approach arise from the property that spatial
information such as obstacles to motion or pathways
that guide motion are directly evident in images. Our
approach is not only intuitive, but flexible as well.
Objects which have irregular shapes that will make
their algebraic representations complex can be
represented and reasoned about in the same way as
regular objects if diagrams are used.

As Forbus and colleagues rightly point out
[Forbus et al., 1987], there can be no purely
qualitative method for spatial reasoning. What is
required is to integrate qualitative and quantitative
methods so that qualitative ones provide approximate
solutions that serve to focus the application of
quantitative methods to only those aspects of the
initial solutions that require more precision or further
refinement. With this goal in mind, we are currently
investigating the integration of visual reasoning with
other qualitative and quantitative methods [Narayanan
and Chandrasekaran, 1991].

Computer simulation of devices usually consumes a significant amount of
computational time. Making simulation more efficient and goal-oriented
might overcome this problem. The main objective of our model is to provide
solutions to this problem by alloying functional representation (FR) with
structure-based models for simulation.

Based on the VHDL framework, we selected a very general model of
device structure:

•  A device is considered to consist of a box with a set of input and
   output ports, and possibly a state, and can be characterized by a set
   of functions to convert the values at input ports into specific values
   at output ports at particular times while possibly changing the state
   of the device.

•  A device can consist of a set of subdevices, each having its own
   functions, input and output ports, internal states, and also a set of
   connections joining the ports of different subdevices.

•  Signals (values) at input ports of a component may trigger signals
   (values) at output ports, and a state change of the component, in a
   finite amount of time.

•  It is assumed that a connection transfers the value from an output
   port to an input port (or from the input port of a component to the
   input port of its subcomponent) instantaneously.

•  Subdevices/subcomponents can be embedded in components of devices to
   any finite level of depth.

The framework described above applies to a large variety of devices, from
software/hardware devices to devices containing pipes and fluids, etc.

In the following, by simulation we will always mean simulation based
on this kind of structural/behavioral model.
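
The structural model can be pictured with a small data structure. The following Python sketch is illustrative only: the class `Device`, the helper names `depth` and `propagate`, and the port name `sample-out` are assumptions made for the example and do not come from the FR language or from VHDL.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Port = Tuple[str, str]  # (device name, port name)


@dataclass
class Device:
    name: str
    inputs: List[str]
    outputs: List[str]
    state: Dict[str, object] = field(default_factory=dict)
    subdevices: List["Device"] = field(default_factory=list)
    connections: List[Tuple[Port, Port]] = field(default_factory=list)


def depth(device: Device) -> int:
    """Nesting depth of the hierarchy (1 for a device with no subdevices)."""
    return 1 + max((depth(d) for d in device.subdevices), default=0)


def propagate(values: Dict[Port, object],
              connections: List[Tuple[Port, Port]]) -> Dict[Port, object]:
    """Copy values across connections; no simulated time passes."""
    result = dict(values)
    changed = True
    while changed:
        changed = False
        for src, dst in connections:
            # Only fill ports that have no value yet, so the loop terminates.
            if src in result and dst not in result:
                result[dst] = result[src]
                changed = True
    return result


adc = Device("A/D", inputs=["INIT", "signal"], outputs=["sample"],
             state={"active": "idle"})
bus = Device("bus", inputs=["sample"], outputs=["sample-out"])
analyzer = Device(
    "analyzer", inputs=["signal"], outputs=["spectrum"],
    subdevices=[adc, bus],
    connections=[(("analyzer", "signal"), ("A/D", "signal")),
                 (("A/D", "sample"), ("bus", "sample"))],
)
```

Note that `propagate` moves a value from the parent's input port to the subdevice's input port without delay, while producing a value at `A/D.sample` would require a function of the A/D component and therefore simulated time.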

Our major goal is to show

•  how FR can be smoothly combined with structure-based models for
   simulation in a common framework,

•  what the advantages of combining the two are, and how FR can control
   the simulation effort,

•  and what the tradeoffs are between controlling simulation by FR and
   simulating by brute force using the structural model alone.


Functional  Representation  Combined  with  Structural  Model

We supplemented the functional representation language with further
elements in order to support structural/behavioral models of simulation.
We retained most of the earlier additions to the language, like the
parametrization of components, and the possibility of defining different
relations among them (e.g., can-run-on[software hardware]) that we had
made in the previous stage of our current project [1].

We  added  several  new  features  to  the  representation: 

•  Functions can achieve their result states in response to some starting
   state and signals arriving at their input ports. The set of input ports
   used by the function, and the set of output ports affected by it, are
   subsets of the input and output ports of the component the function
   belongs to. The user specifies by means of a transfer function how
   values at input ports and the initial state of the component transform
   into values at output ports and a new state of the component in time.
   Transfer functions are ordinary LISP functions; they specify how FR
   functions are achieved.

   FR's TOMAKE statement, which describes the state achieved by the
   function, has been retained in the representation for explanatory
   purposes. The TOMAKE statement provides a higher level/more abstract
   view of the state being achieved by the FR-function than the transfer
   function does; therefore, it can be more conveniently used to
   understand how the device is functioning.

   An example of the component specification can be seen in figure 1.

•  Functions can also have their own input and output ports, which are some
   subset of the input and output ports of the device. The user specifies by
   means of a transfer function (written in Common Lisp) how values at
   input ports and the initial state of the component transform into
   values at output ports and a new state of the component in time.¹

   By transaction we mean a set of output port values, and a state of
   the component at some specific time.

   The inputs to the transfer function are the values at the input ports
   of the function, the actual time, the state of the component, and the
   projected future transactions of the component (which were created as
   a result of earlier events associated with the component); it creates
   a new set of projected transactions, and a new state of the component
   at a projected later time.²

   Any component may contain subcomponents connected by a set of
   connections (joining ports of different components).
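
The contract of a transfer function (inputs, current time, component state, and pending transactions in; projected transactions out) can be illustrated with a Python analogue. This is a sketch under assumptions, not the project's Lisp code: the sampling behaviour, the one-sample-period delay of `1/SF`, and the field names of `Transaction` are invented for the example, and only the parameter names SF and BS are taken from the A/D specification.

```python
from typing import Dict, List, NamedTuple


class Transaction(NamedTuple):
    time: float                 # when the outputs and state take effect
    outputs: Dict[str, object]  # output port name -> value
    state: Dict[str, object]    # component state at that time


def tf_sampling(inputs, state, params, time,
                pending: List[Transaction]) -> List[Transaction]:
    """Hypothetical sampling behaviour: emit the input one sample period later."""
    if state["active"] != "running" or state["buffer_available"] <= 0:
        return list(pending)  # nothing new is projected
    # The new state is a copy, so the caller's state is left untouched.
    new_state = dict(state, buffer_available=state["buffer_available"] - 1)
    due = time + 1.0 / params["SF"]  # SF is the sampling-frequency parameter
    return list(pending) + [
        Transaction(due, {"sample": inputs["signal"]}, new_state)
    ]


state = {"active": "running", "buffer_available": 1024}
projected = tf_sampling({"signal": 0.25}, state, {"SF": 2000, "BS": 1024}, 0.0, [])
```

Returning transactions instead of mutating the component lets the simulator decide when, or whether, the projected outputs and state are committed.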

¹ Estimate transfer functions have also been introduced. They are used for complex
components/functions with a deep substructure hierarchy. If the details of the functioning
of the given component/function are not important for our problem-solving goals,
and we do not want to explore it in depth, we may still get an estimate of the
functioning of the component/function using an estimate transfer function. They work
similarly to regular transfer functions, but a short-cut of a possibly deep
structure/substructure hierarchy can be provided. The results produced will only be
estimated values.

By using estimate transfer functions, simulation can be continued even if a portion of
the functional hierarchy has not been explored.

² Signals that are transferred across the device carry an identifier and a timestamp with
them, so the time they spend in the system can be measured.

COMPONENT A/D
  PARAMETER
    SF 2000 “Sampling frequency:”
    BS 1024 “Buffer size:”
  INPUT
    INIT
    signal
  OUTPUT
    sample
  FUNCTION
    InitializeA/D TOMAKE STATE active(A/D) = ’running
    INPUT
      INIT
    TRANSFER-FUNCTION tf-initialize-A/D
    (defun tf-initialize-A/D (inp-list state-var-list parameter-list
                              func-params time trans-list) ...
    )
  FUNCTION
    sampling TOMAKE sampled-signal AT-PORT sample
    INPUT
      signal
    OUTPUT
      sample
    BY TRANSFER-FUNCTION tf-sampling
    (defun tf-sampling (inp-list state-var-list parameter-list
                        func-params time trans-list) ...
    )
  STATES
    active(’idle ’running) = ’idle
    buffer-available = BS
END COMPONENT

Figure 1: Component specification for A/D.

•  We define functional groups as sets of functions of a component.
   Substructures within a component (sets of components joined by
   connections) belong to one particular functional group. The role of the
   functional groups is that different subsets of functions may be
   implemented by different substructures (but there is no intersection
   between different functional groups, or between the substructures
   belonging to different functional groups).

   One example of the use of functional groups can be seen in the
   representation of Main-CPU in our prototype. Main-CPU has four
   functions:

   INITIALIZE, FFT, Walsh, and filter.

   INITIALIZE has one input port, ON, and one output port, INIT.

   FFT, Walsh, and filter all use the same input port, signal, and each
   of them has a separate output port. This means that the substructures
   describing FFT, Walsh, and filter intersect; therefore, these 3
   functions are grouped into one functional group.

   INITIALIZE is realized by a completely distinct substructure, without
   any common part or connection with the three other functions;
   therefore, it belongs to another functional group.

•  As in the usual FR, the causal processes by which the functions are
   realized can also be specified. OR relationships are allowed between
   these causal process descriptions in order to represent alternative
   ways/causal processes to realize certain functions. (This is similar to
   the VHDL feature that multiple architectures can be specified for one
   entity.) These alternative causal processes may play many important
   roles; among others, they may be useful in order to maintain the
   reconfigurability of the device, where alternative causal processes
   might correspond to alternative configurations (software-hardware
   assignments).

•  Functions or functional groups can separately depend on conditions
   based either on traditional states or relational states (usually
   representing some particular software-hardware assignment, but a
   relationship of any kind can be used) under which the function applies.
   Connections between ports can also be conditioned either individually
   or in any kind of groups. Connections are named so that they can be
   easily referenced by causal process descriptions.

   A portion of the structural description can be seen in figure 2.

CONNECTIONS
  ( 4 sample-main OF Main-CPU => sample (FFT OF Main-CPU)
    5 sample (FFT OF Main-CPU) => sample (FFT OF fft-main)
    6 FFT (FFT OF fft-main) => fft-transform-of-sample (FFT OF Main-CPU))
  IF RELATION running-on (fft-main Main-CPU)

Figure 2: A portion of the structural description for the FFT, Walsh, filter
functional group of Main-CPU.

•  Components can have instances. Values of component parameters can
   be separately set for each instance at its creation (or at any later
   time). A separate set of input and output ports is created for each
   instance of the component, but all attributes of the component
   (functions, transfer functions, states, etc.) are inherited.
   Connections can be created between ports of either components or
   instances. (Connections are not inherited by the instances.)

•  The state transitions of causal process descriptions (cpds) can be
   described either by connections, or by some function of some
   component, or by some cpds. The state transition can be conditioned on
   some assumption (described by a PROVIDED clause), or on a set of
   configuration-dependent relationships specifying the applicability of
   the state transition.

   The state or the value of some port can be made explicit at any point
   of the cpd. (See example of cpd in figure 3).
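
The grouping rule for functional groups in the list above (functions whose substructures intersect fall into the same group) can be sketched as a merge of overlapping sets. In this Python illustration, shared ports stand in for intersecting substructures, which is a simplifying assumption; the function name `functional_groups` and the output port names `fft-out`, `walsh-out`, and `filter-out` are hypothetical.

```python
from typing import Dict, List, Set


def functional_groups(ports_by_function: Dict[str, Set[str]]) -> List[Set[str]]:
    """Merge functions into groups whenever their port sets overlap."""
    groups = []  # list of (function names, union of their ports)
    for name, ports in ports_by_function.items():
        merged_names, merged_ports = {name}, set(ports)
        untouched = []
        for names, group_ports in groups:
            if group_ports & merged_ports:
                # Overlap: absorb the existing group into the new one.
                merged_names |= names
                merged_ports |= group_ports
            else:
                untouched.append((names, group_ports))
        groups = untouched + [(merged_names, merged_ports)]
    return [names for names, _ in groups]


main_cpu = {
    "INITIALIZE": {"ON", "INIT"},
    "FFT": {"signal", "fft-out"},
    "Walsh": {"signal", "walsh-out"},
    "filter": {"signal", "filter-out"},
}
groups = functional_groups(main_cpu)
```

For the Main-CPU example this yields two groups, one holding FFT, Walsh, and filter (they share the `signal` port) and one holding INITIALIZE alone.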

Representation  of  Resource  Use 

Resources are specified for every component of the device individually. The
amount of the resource may depend on some parameters of the component
or on the entire component.

CPD sample-and-transfer
  PORT signal OF REAL-TIME-SPECTRUM-ANALYZER
  CONNECTION 6 OF REAL-TIME-ANALYZER
  PORT signal (sampling OF A/D)
  USING FUNCTION sampling OF A/D
    PROVIDED active(A/D) = ’running
  PORT sample (sampling OF A/D)
  CONNECTION 7 OF REAL-TIME-ANALYZER
  PORT sample (TransferSample OF bus)
  USING FUNCTION TransferSample OF bus
    PROVIDED active(bus) = ’running
  CONNECTION 8 OF REAL-TIME-ANALYZER
    IF RELATION
      OR
      (running-on (FFT-main Main-CPU)
       running-on (Walsh-main Main-CPU)
       running-on (filter-main Main-CPU))
  PORT sample-main OF Main-CPU
  CONNECTION 9 OF REAL-TIME-ANALYZER
    IF RELATION
      OR
      (running-on (FFT-dsp Main-CPU)
       running-on (Walsh-dsp Main-CPU)
       running-on (filter-dsp Main-CPU))
  PORT sample-dsp OF Main-CPU
END CPD

Figure 3: Description of the sample-and-transfer causal process.

USING RESOURCE (AT ESTIMATION-LEVEL 1)
  (* (♦ BS t22) 0.84) OF Main-CPU-time WITH TOLERANCE 0.01
USING RESOURCE (AT ESTIMATION-LEVEL 1)
  250 OF Main-Memory WITH TOLERANCE 10

Figure 4: Resource-use estimate for function Walsh of Main-memory.

We can represent resource use of functions or configurations at different
levels of estimation. The level of estimation depends on the depth in the
representation (estimates at subcomponent level are usually finer than
estimates at component level). The accuracy of the estimate is characterized
by the value of tolerance attached to each resource use specifier: the deeper
the level of estimation is, the smaller the values of tolerances should be. (See
example in figure 4.)

There are resource use values attached to each software-hardware
assignment; these are the coarsest estimates. The actual values of resource
use are described at the deepest component level (components that have no
further subcomponents).

External  Conditions 

External conditions are represented in a separate section of the combined
representation. Specific values may be placed at particular ports of the
device at any given time.

States  can  be  instantiated  to  specific  values  in  the  STATES  subsections 
of  components. 


Simulation  Process 

The actual simulation starts by placing the values specified by the external
conditions at the appropriate ports of the device.

In every problem-solving cycle, the set of earliest events is determined.
An event can mean either a component that is able to fire, i.e., has all the
needed signals at its input ports, or a value arriving at some (output) port
which can be transferred to another port through a connection.

If several events may happen at the same time, the causal process
descriptions can be used to order those possible events (as will be
discussed later in detail).

If a particular component is due to fire (possesses an active event) and all
of its conditions are satisfied, then either its transfer function will be invoked
(if the component does not have subcomponents), or the substructure of the
component will be traversed, and the input signals will be transferred through
the connections. The simulation process is complicated by goal-dependent
decisions,³ as will be discussed later.

The simulation terminates if

•  either the predefined simulation time has elapsed,

•  or there are no more transactions left in the system, i.e., no more
   events can happen,

•  or our problem-solving goal has been satisfied.
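
The cycle and its three termination conditions can be sketched as an event-driven loop. This Python illustration is a simplified stand-in, not the project's simulator: the function `simulate`, the event layout `(time, port, value)`, and the `handlers` mapping are assumptions, and the ordering of simultaneous events by causal process descriptions is not modelled (ties fall back on tuple comparison).

```python
import heapq


def simulate(initial_events, handlers, max_time, goal_reached):
    """Run events in time order; return (reason for stopping, port history)."""
    queue = list(initial_events)  # each event: (time, port, value)
    heapq.heapify(queue)
    seen = {}
    while queue:
        time, port, value = heapq.heappop(queue)  # earliest event first
        if time > max_time:
            return "time-elapsed", seen           # condition 1
        seen[port] = (time, value)
        if goal_reached(seen):
            return "goal-satisfied", seen         # condition 3
        handler = handlers.get(port)
        if handler is not None:
            # A handler plays the role of a transfer function or connection:
            # it projects new events from the one that just occurred.
            for event in handler(time, value):
                heapq.heappush(queue, event)
    return "no-more-events", seen                 # condition 2


handlers = {"signal": lambda t, v: [(t + 1.0, "sample", v)]}
reason, history = simulate(
    [(0.0, "signal", 5)], handlers, 10.0, lambda seen: "sample" in seen
)
```

Checking the goal after every event is what allows a goal-directed run to stop earlier than a run that simply exhausts its events or its time budget.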

Advantages  of  the  Simulation  -  Functional  Representation  Symbiosis

Placing  FR  at  the  top  of  the  simulation  model  results  in  representational 
benefits,  on  one  hand,  and  also  provides  significant  advantages  in  controlling 
the  problem-solving  process,  on  the  other  hand. 

Representational  Benefits 

The FR language makes possible a very important distinction between
functions and components (as parts of the structure of the device).

The simulation language, VHDL, can also represent functions: the entities
in VHDL can represent either functional or structural elements. Nevertheless,
VHDL cannot make any distinction between the two and cannot describe how
structural and functional elements are related to each other, which may be
a major drawback for several purposes (for example, diagnosis).⁴

³ e.g., estimating resource use and checking on the estimates in each step if the goal is
deciding resource-satisfiability

Another advantage of using FR is that causal processes that describe the
behavior of the device are represented explicitly. Though this information
is also included in the VHDL description of the device, in the form of how
signals traverse the system, it is hidden within the representation and cannot
be used efficiently for controlling problem-solving. In particular, we would
like to use causal process descriptions to control simulation based on
predefined problem-solving goals.

Another advantage of our representation is that causal process
descriptions, or even the structure of the device (the connections between
the parts), can all be conditioned either on assumptions or on
configuration-dependent relations, making explicit representation of
reconfigurability possible.

Using  the  Combined  Model  for  Problem-Solving 

Our main objective is to reduce simulation effort based on the goals we
would like to achieve by the simulation. So far, our first goal has been to
find out whether a given configuration would work without exhausting any of
the resources; the second goal has been to determine the values of the
signal at a pre-defined set of ports.

Deciding  Resource-use  Satisfiability 
Our  strategy  was  the  following: 

•  In the first step, we determined the set of critical resources based on
   the highest-level resource-use estimates.⁵

•  Simulation has been applied in an intelligent fashion to find out more
   about critical resources (this will be discussed later in detail).

^Example:  Exactly  the  same  structural  port  may  be  used  by  more  than  one  function 
of  the  component.  In  the  ideal  case,  the  different  kinds  of  uses  of  that  port  can  be  examined 
independently.  However,  if  the  component  fails,  the  two  uses  of  that  port  can  conflict  with 
each  other. 

In our prototype, the signal port of Main-CPU is used by three different functions
(FFT, Walsh, and filter); every other port is used by exactly one function.

*Critical  resources  are  those  for  which  we  cannot  decide  whether  they  can  or  cannot 
be  satisfied  based  on  earlier  information. 




• If one of the critical resources has been exhausted, i.e., we know for sure
that the given configuration cannot function, the simulation is stopped
immediately.

• If one of the critical resources turns out to be sufficient without any
ambiguity, it is removed from the set of critical resources.

• If there are no more critical resources left, the simulation effort is
stopped. *

• If the simulation terminates due to either of the termination conditions
described earlier in the ’Simulation Process’ section, then the remaining
critical resources cannot be resolved by the simulation; only their
estimated values and tolerances can be decided.
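
The strategy above can be sketched in a few lines. This is only an illustrative sketch, not the prototype's code: the names `Estimate`, `classify`, and `decide_satisfiability` are hypothetical, the simulator is replaced by a plain sequence of refined estimates, and a resource is assumed to be "critical" when its estimate plus or minus its tolerance straddles the available capacity.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    value: float      # estimated resource use
    tolerance: float  # uncertainty of the estimate (plus or minus)


def classify(est, capacity):
    """Classify one resource as exhausted, sufficient, or still critical."""
    if est.value - est.tolerance > capacity:
        return "exhausted"    # even the most optimistic value does not fit
    if est.value + est.tolerance <= capacity:
        return "sufficient"   # even the most pessimistic value fits
    return "critical"         # cannot be decided from this estimate


def decide_satisfiability(capacities, initial, refinements):
    """`refinements` stands in for simulation steps: (resource, Estimate) pairs."""
    critical = set()
    for resource, est in initial.items():
        status = classify(est, capacities[resource])
        if status == "exhausted":
            return "unsatisfiable", {resource}
        if status == "critical":
            critical.add(resource)
    for resource, est in refinements:
        if not critical:
            break                         # nothing left to resolve: stop early
        status = classify(est, capacities[resource])
        if status == "exhausted":
            return "unsatisfiable", {resource}
        if status == "sufficient":
            critical.discard(resource)
    if critical:
        return "undecided", critical      # only estimates and tolerances known
    return "satisfiable", set()
```

The three return values correspond to the three ways the process above can end: a critical resource is exhausted, no critical resource remains, or the simulation stops with some resources unresolved.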

Determining  Values  of  Signals  at  a  Predefined  Set  of  Ports 

This goal is simpler to implement than the previous one. The simulation runs
until every goal has been satisfied, i.e., until every interesting port value
has been determined.

For both goals, our effort has been focused on intelligently controlling
simulation based on FR. We developed two different strategies for this
purpose and applied them in combination:

• One method was to control the depth of reasoning based on problem-solving
needs (i.e., not to descend deeper into the component hierarchy than
required).

• The other strategy was to avoid following irrelevant simulation paths
which do not contribute to our problem-solving goals.

Controlling  the  Level  of  Reasoning 

Most devices can be examined at many different levels of detail, from the level
of the entire device through components and subcomponents down to the smallest
represented parts. A significant amount of computational time can be saved if it
can be decided during problem-solving where it is worth skipping the details.
Our strategy is to go deeper into the component/functional hierarchy if and
only if we can thereby gain valuable information concerning our particular
simulation goal (i.e., information related to at least one of the critical
resources or interesting ports).

*It may depend on our additional simulation goals whether outputs should be created
as a result of the functioning of the device.

Figure 5: Resource use hierarchy for the resource Main-Memory. The top box
represents the entire device; each of the other boxes represents a function of
a certain component together with its resource utilization, or its resource
utilization estimate (in parentheses) with the tolerance (in brackets).

In the case of resource-use satisfiability, the implementation of this strategy
consists of two main steps:

• In one step, a resource use hierarchy is built for each resource before the
simulation process begins. Each hierarchy contains every function that uses
the given resource; parent and child functions are mutually linked to each
other.

An example of a resource use hierarchy can be seen in figure 4.

• Every time the simulator discovers a new substructure which could be
explored, it decides whether it is worth exploring by analyzing the resource
use hierarchies of the critical resources. If it decides that a short-cut can
be taken, i.e., that going into more detail would not provide any additional
information on any of the interesting resources, then the estimate transfer
function of the current function is used (so that the simulation can
continue).

In the case of determining port values, a short-cut can be taken while
traversing a module if no interesting port resides within that module.
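
The two steps for the resource-use case can be illustrated with a small sketch. It rests on assumptions that are not taken from the prototype: the function hierarchy is a plain dictionary from a function to its subfunctions, direct resource use is a dictionary from a function to a set of resource names, and the names `build_use_hierarchy` and `should_descend` are hypothetical.

```python
def build_use_hierarchy(children, uses, resource):
    """Return every function that uses `resource` itself or via a subfunction."""
    members = set()

    def visit(fn):
        used = resource in uses.get(fn, ())
        for child in children.get(fn, ()):
            if visit(child):          # visit all children, no short-circuit
                used = True
        if used:
            members.add(fn)
        return used

    kids = {c for cs in children.values() for c in cs}
    roots = (set(children) | set(uses) | kids) - kids
    for root in roots:
        visit(root)
    return members


def should_descend(fn, children, hierarchies, critical):
    """Explore the substructure of `fn` only if one of its subfunctions
    appears in the use hierarchy of some critical resource."""
    return any(
        child in hierarchies[resource]
        for resource in critical
        for child in children.get(fn, ())
    )
```

When `should_descend` returns `False`, the simulator would take the short-cut and apply the estimate transfer function of the current function instead of exploring its substructure.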


Controlling  Simulation  By  Means  of  Causal  Process 
Descriptions 

In the first run, we applied simulation without the explicit use of cpd’s. We
discovered several inefficiencies:

• Let’s assume that the device contains two causal processes, A and B, and
that B causally depends on A. Assume that both of them can be executed by
the simulator simultaneously, i.e., the waiting queue holds potential
transactions with the same timestamps belonging to each of the paths. The
simulator selects one of the transactions at random, so let’s assume that
the transaction on path B is selected, and that the same happens in the
subsequent steps. Then, at some point, the transaction picked on path B
will fail because of the causal dependency, and the simulator will continue
on path A. ^

We could have saved this failing step by ordering transactions according
to the causal paths they belong to. *

^We had an example of this situation in our prototype: Our prototype consists of
two major causal processes; the first one initialises everything in the device, and the
second one performs the actual functions of the device (sampling, then calculating FFT
or Walsh transforms or filtering the signal, etc.). If we attempt to do sampling before
the A/D converter (which performs the sampling) has been initialised, this step will fail.
We may waste valuable computational steps by attempting it several times.

*An ordinary simulator checks the conditions of all the earliest events in each cycle
and picks one whose conditions are satisfied. Our simulator, on the other hand, knows
which event to pick without this condition-checking.

• Another source of inefficiency arises when simulation is run on causal paths
that are completely irrelevant to the problem-solving goal, i.e., when
processes are followed that

• do not use the critical resources, and

• have no other processes depending on them which use any of the critical
resources. ^

To overcome the inefficiencies described above, we can use FR’s causal
process descriptions to control simulation.

The implementation of this strategy requires pre-processing of the
representation before the problem-solving process can start: cpd’s are
analyzed pairwise to determine whether there is

• any causal dependency between them:

• either some assumption required by one process is established by the
other one,

• or some port value required by one of them is filled by the other one;

• any resource dependency between them (i.e., whether they use the same
kind of resource).

The preliminary analysis of FR’s causal process descriptions described above
is independent of our particular problem-solving goals.
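
A minimal sketch of this pairwise analysis is given below. It assumes, purely for illustration, that each cpd can be summarised by the assumptions it requires and establishes, the ports it reads and fills, and the kinds of resource it uses; the `CPD` record and the `analyze` function are hypothetical names, not part of FR.

```python
from dataclasses import dataclass
from itertools import combinations, permutations


@dataclass(frozen=True)
class CPD:
    name: str
    requires: frozenset = frozenset()     # assumptions the process needs
    establishes: frozenset = frozenset()  # assumptions the process sets up
    reads: frozenset = frozenset()        # ports whose values it needs
    writes: frozenset = frozenset()       # ports whose values it fills
    resources: frozenset = frozenset()    # kinds of resource it uses


def analyze(cpds):
    """Return (causal, shared): causal holds (a, b) when b depends on a;
    shared holds unordered pairs of processes using the same kind of resource."""
    causal = set()
    for a, b in permutations(cpds, 2):
        if (a.establishes & b.requires) or (a.writes & b.reads):
            causal.add((a.name, b.name))
    shared = set()
    for a, b in combinations(cpds, 2):
        if a.resources & b.resources:
            shared.add(frozenset((a.name, b.name)))
    return causal, shared
```

Because the analysis looks only at the descriptions themselves, it can be run once before any goal is chosen, which matches the goal-independence noted above.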
Simulation  can  be  controlled  by  the  cpd’s  in  the  following  way: 

• As discussed earlier, in every simulation step the set of earliest
transactions (those that can happen simultaneously) is selected. For every
transaction, we determine the particular causal process it belongs to; then

• each of the cpd’s is checked for relevance to the problem-solving task, and
if it is not relevant, it is removed from the set of active events;

• the cpd’s of the active transactions are checked pairwise for causal
dependencies between them, and the transactions are ordered to be acted upon
accordingly.

^An example from our prototype:

It turns out at the coarsest estimation level that MAIN-MEMORY is our critical resource.
In our configuration, filter runs on DSP. From the causal dependency analysis of our
device, it can be clearly seen that the filter process does not influence our decision on
MAIN-MEMORY use: it does not use MAIN-MEMORY itself, and no other process that uses
that resource depends on it.

One important feature of the implementation strategy described above is
that goal-dependent and goal-independent steps are well separated:

• The parsing and preliminary analysis of FR (analysis of cpd’s) are
goal-independent, as is the simulation process itself.

• During the simulation process, some goal-dependent decision functions
are called in order to control the simulation, namely to

• order simulation steps, and

• decide where irrelevant simulation paths can be pruned.
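
The two goal-dependent decision functions might look as follows. This is a sketch under stated assumptions rather than the prototype's implementation: `relevant_cpds` and `schedule` are hypothetical names, the causal dependencies are assumed to be available as (earlier, later) pairs from the pre-processing step, and Python's standard `graphlib` module (Python 3.9 or later) is used for the ordering.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def relevant_cpds(uses, causal, critical):
    """A process is relevant if it uses a critical resource, or if some
    relevant process causally depends on it (directly or indirectly)."""
    relevant = {name for name, res in uses.items() if res & critical}
    changed = True
    while changed:
        changed = False
        for before, after in causal:
            if after in relevant and before not in relevant:
                relevant.add(before)
                changed = True
    return relevant


def schedule(transactions, causal, relevant):
    """Prune transactions of irrelevant processes, then order the rest so
    that a process never runs before the processes it depends on.
    `transactions` is a list of (transaction_id, cpd_name) pairs that all
    carry the same timestamp."""
    active = [(t, c) for t, c in transactions if c in relevant]
    names = {c for _, c in active}
    graph = {c: {a for a, b in causal if b == c and a in names} for c in names}
    rank = {c: i for i, c in enumerate(TopologicalSorter(graph).static_order())}
    return sorted(active, key=lambda tc: rank[tc[1]])
```

With MAIN-MEMORY as the only critical resource, a filter process that neither uses that resource nor feeds a process that does would be pruned, and initialisation would be scheduled before sampling.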


Tradeoffs  between  Controlling  Simulation  by 
FR  and  Pure  Simulation 

As we saw in the previous section, controlling simulation by FR incurs
considerable overhead. However, a significant part of the additional
computation can be done in advance, so most of the simulation-time overhead
can be avoided by additional prior computation.

Whether it is worth the extra effort depends strongly on our goals and on
the kind of device being simulated. The more paths that can be weeded out
before simulation is performed on them, the more economically our suggested
method can be used.

If we have only a few narrowly focused goals, then using FR will almost
certainly lead to significant savings. On the other hand, if a vast amount of
information must be collected, we may need to think about some other,
more appropriate way to reduce simulation effort.
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A  prototype  system  for  automated  generation  and  testing  of  software/hardware 
configurations  can  benefit  from  using  device  representations  that  focus  on 
functions. These capabilities can support designers in analyzing the designed
device. For example, one benefit is that some major characteristics, like
resource use or fault tolerance, can be examined without performing detailed
simulation. Based on our ability to generate and test configurations, we give
an example of how automated reconfiguration can take place, which can support
fast reconfiguration in case of component failures. Another interesting
benefit of our problem solver is that functions can be prioritized so that,
if not all functions can be achieved, a degraded mode of operation may
still be possible.

1 Using the Functional Representation Language
for Automated Reconfiguration

Understanding a device means, in part, understanding how its functions are
achieved by way of the functions and behaviors of its components and their
subcomponents. This kind of knowledge can be used, for example, for
diagnostic reasoning [1], causal explanation of diagnostic conclusions [2],
control of simulation [5], or debugging of computer programs [3]. In
case-based design, the known functions of pre-stored designs can provide
indexes to assist with finding the pre-stored design case closest to the
desired design [4].


Systems comprised of software running on hardware can be considered as
devices. The hardware elements and the programs running on them are the
components of the device, and the information that moves around and is
transformed during processing is a special kind of component (we call such
things ’movable substances’). Our work is based on extensions to the
Functional Representation language (FR), first described by Sembugamoorthy
and Chandrasekaran in [1]. The FR language describes a device as something
which has one or more known functions. The device is made up of
components, and the main device achieves its functionality by employing the
services of its components.

FR  was  originally  devised  for  physical  devices,  but  as  [3]  has  shown, 
programs  can  also  be  considered  as  devices.  In  our  domain,  a  program  can 
usually  run  on  multiple  hardware  components,  i.e.  the  same  functionality 
can  be  achieved  in  several  ways. 

By a particular configuration of the device we mean a specific assignment
of software to hardware.

Reconfiguration means reassignment of software to hardware, which
may be needed as a consequence of a hardware failure or in order to achieve
a better utilization of resources.

Estimating the resource use of a particular configuration and
reconfiguring the device around faults may be major concerns of users. We can
represent the device either independently of the particular configuration, i.e.,
regardless of which software is actually running on which hardware element,
or the representation may reflect the specific association between the software
and the hardware the program is running on.

We supplemented the Functional Representation language with further
primitives which make it more suitable for reasoning about devices consisting
of software and hardware components and for solving the desired
problem-solving tasks. A small prototype has been created (see figure 1), and
the enhanced representation has been used to solve several problems:

• Check the consistency of the configured device:

• Will the desired functions be achieved by the device?

• How much of each resource is used on average during its functioning?
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displays  FFT, 

Walsh-transform,
filtered signal

Parameters: size of the buffer, sampling frequency, and different resource
parameters of the various software and hardware components

Figure 1: Simple signal-processing device consisting of an A/D converter, a
digital signal processor, a display, a main computer, and a timer, as
represented in the prototype.

• Check all possible alternative configurations, figure out which of the
desired functions are achieved by each of them, and estimate the resource
use for each.

• Try to find all possible configurations if one or more of the components
have failed. Assign a priority order to functions and try to achieve
high-priority functions first. Determine which functions can be achieved in
this degraded mode (if not all of them are achievable).

2 Extensions to the Functional Representation
Language

The Functional Representation language represents the structure of the
device in terms of components and the relations among them, represents the
functions of the device and its subdevices at several levels of abstraction,
and provides causal process descriptions (in other words, behaviors of the
device) of how the various functions are achieved by the components and
subcomponents of the device. Assumptions made and generic knowledge used are
also represented explicitly. The device and its components can be in different
states during their operation, and a particular behavior of the device can be
described in terms of a chain of state transitions.

Our particular field of application (the software/hardware environment)
required several extensions to be made to the existing Functional
Representation language:

• Parameters.

The components of the device may have parameters, the values of which
are determined externally by the user (size of the buffer, sampling
frequency, etc.). A variable name, a default value, and a textual
description are assigned to each parameter.

• Resources are also assigned to components; the type, amount, and
unit are specified for every resource (example: 128 blocks of RAM).
The amount of the resource may be expressed as a function of the
parameters of the components.

• Resource uses are attached to behaviors (the amount of a certain kind of
resource used during some state transition is specified along with the
particular state transition).

• An OR relationship is used between alternative behaviors which achieve the
same state transition. The alternative branch being followed usually
depends on a configuration-related condition (how the software is
assigned to hardware). Each alternative behavior may require a different
set of software/hardware assignments.

• Two specific relations for the software/hardware context, running-on
and can-run-on, are given special meaning:

• Running-on means that the given software can run on the set of
hardware components, and that it is initialized, loaded, started, etc.

• Can-run-on means only the ability of the software to run on the
particular hardware.

The meaning of these two relationships is hard-coded in our current
representation language. In the future it would be preferable to extend our
representation with new primitives which make it possible for users to define
their own types of relationships.
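
To make the first two extensions concrete, the sketch below models parameters and resources as small Python records. It is not FR syntax, which is not reproduced here; the class and field names are hypothetical, and the doubling rule in the usage example is an invented illustration of a resource amount that depends on a parameter.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Union

# An amount is either a constant or a function of the parameter values.
Amount = Union[float, Callable[[Dict[str, float]], float]]


@dataclass
class Parameter:
    name: str              # variable name
    default: float         # default value, may be overridden by the user
    description: str = ""  # free-text description


@dataclass
class Resource:
    kind: str
    amount: Amount
    unit: str

    def available(self, params):
        """Evaluate the amount for the given parameter values."""
        return self.amount(params) if callable(self.amount) else self.amount


@dataclass
class Component:
    name: str
    parameters: Dict[str, Parameter]
    resources: Dict[str, Resource]

    def parameter_values(self, overrides=None):
        """Defaults, replaced by any user-supplied values."""
        values = {n: p.default for n, p in self.parameters.items()}
        values.update(overrides or {})
        return values
```

For example, `Resource("RAM", 128, "blocks")` describes the constant resource mentioned above, while `Resource("buffer", lambda p: 2 * p["buffer_size"], "words")` describes an amount computed from a component parameter.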


3  Discussion 

In the following, we discuss how the reconfiguration task (finding
alternative configurations) is supported by representing the functions of the
device, in other words, how strongly FR is used in our problem-solving tasks.

The first task checks the consistency of the functional/behavioral
description of the device and estimates the average resource use by following
causal paths in the representation. In this task, the FR is used only as an
organizing principle to describe the device in terms of functions,
components, subfunctions, etc.

Estimated resource uses are attached to functions, to the behaviors achieving
those functions, or to the state transitions constituting those behaviors, at
different levels of the functional representation. In order to estimate the
overall amount of resources needed to achieve some function, the FR is used to
determine all the lower-level functions and behaviors associated with that
function, so that the resource uses belonging to those lower-level functions,
behaviors, and state transitions can be accumulated.
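
The accumulation described above amounts to a recursive sum over the function hierarchy. The sketch below assumes a simplified encoding in which behaviors and state transitions are folded into their functions; `total_use`, `subfunctions`, and `own_use` are hypothetical names.

```python
def total_use(function, subfunctions, own_use):
    """Sum the average resource use attached to `function` and to every
    lower-level function reachable from it."""
    totals = dict(own_use.get(function, {}))
    for sub in subfunctions.get(function, ()):
        for resource, amount in total_use(sub, subfunctions, own_use).items():
            totals[resource] = totals.get(resource, 0) + amount
    return totals
```

The function names and amounts used to call it (for instance a top-level display function built from sampling and FFT subfunctions) would come from the device description; none are fixed by the sketch.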

Checking the consistency of the FR of the device means traversing the
FR hierarchy associated with each function of the device and following all the
possible paths by which it is realized. It should be checked whether each
behavior achieving some function is defined in the FR and really achieves the
desired function, and whether all the states constituting the corresponding
state transitions are declared for the appropriate components.

The second task generates all possible alternative configurations
combinatorially and tries each of them individually. The method of creating
alternative configurations builds fundamentally on the FR: two configurations
are considered alternatives on the basis of their functionality.

Because alternative configurations are generated combinatorially, in all
possible ways, their number may be very large, and the system does not have
any knowledge to guide generation or to select the best alternatives. The
system must therefore be supplemented with further knowledge for this purpose.

The implementation first creates a list of possible assignments, in the form
of (function, software-component, hardware-components) triples, based on the
'can-run-on' and 'running-on' relations specified in the RELATIONS section of
the representation.

The set of possible assignments is filtered based on the desirable functions,
using the functional representation: only those possible assignments remain
in the list which contribute to at least one of the functions to be achieved.
This is implemented by traversing the FR hierarchy associated with each
desired function.

Alternative behaviors (which achieve the same state transition in different
ways) are specified by OR relations between behaviors realizing functions or
state transitions (as mentioned in the previous section as an extension to the
FR language). Alternative behaviors are used to create alternative subsets
of possible assignments (of software to hardware components), each subset
belonging to one of the behaviors (see figure 2). Assignments belonging to a
behavior are part of the conditional description of the behavior (i.e., they
provide conditions for the validity of the behavior).


A  -  running-on  -  B 
C  -  running-on  -  D 


A  -  running-on  -  X 
Y  -  running-on  -  W 
Z  -  running-on  -  T 


A  -  running-on  -  B 
M  -  running-on  -  N 


and 


D  -  running-on  -  E
F  -  running-on  -  G 


H  -  running-on  -  J 


Figure  2.  Alternative  sets  of  possible  assignments. 

Actual configurations are selected in all possible combinations using a
brute-force approach based on the alternative subsets of possible assignments:
one assignment is picked from each set of alternatives, in all possible ways.
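
The filtering and brute-force selection steps can be sketched with `itertools.product`. The encoding is an assumption made for illustration: a possible assignment is a (function, software, hardware) triple for filtering, each alternative subset is a set of (software, hardware) pairs, and `filter_assignments` and `configurations` are hypothetical names.

```python
from itertools import product


def filter_assignments(assignments, desired):
    """Keep only the triples that contribute to a desired function."""
    return [a for a in assignments if a[0] in desired]


def configurations(alternative_sets):
    """Pick one alternative subset from each set of alternatives, in all
    possible ways, and merge the picks into one candidate configuration."""
    for choice in product(*alternative_sets):
        yield frozenset().union(*choice)
```

Applied to the two groups of alternatives shown in figure 2 (three subsets in the first group, two in the second), this enumeration yields 3 x 2 = 6 candidate configurations, which illustrates how quickly the number grows without knowledge to guide generation.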

The third problem is to design around faults. It is similar to the previous
task, but it generates only those configurations which do not utilize the
faulty component(s). Before the alternative subsets of possible assignments
are created, all assignments are checked for faulty components, and
assignments containing one or more faulty components are removed from the list.

Desirable functions may be organized in a priority order. The possible
assignments are then ordered according to the priority of the functions: the
functional representation of the device is used to decide which possible
assignments are associated with each of the functions, and the possible
assignments related to the highest-priority function are placed at the
beginning of the possible assignment list, and so on. The goal of this
ordering is to make sure that we cannot run out of some resource required for
a high-priority function as a consequence of the resource use of some
lower-priority function, i.e., we want to ensure that higher-priority
functions are achieved before lower-priority functions.
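
A small sketch of fault removal and priority ordering follows. The triple layout (function, software, tuple of hardware components) mirrors the description above, but the function names are hypothetical, and the greedy allocation in `degraded_mode` is only one plausible reading of how a degraded mode could be determined, not the prototype's documented algorithm.

```python
def remove_faulty(assignments, faulty):
    """Drop every assignment that mentions a faulty component."""
    return [
        a for a in assignments
        if a[1] not in faulty and not (set(a[2]) & faulty)
    ]


def order_by_priority(assignments, priority):
    """Put assignments of higher-priority functions first (stable sort)."""
    rank = {fn: i for i, fn in enumerate(priority)}
    return sorted(assignments, key=lambda a: rank.get(a[0], len(rank)))


def degraded_mode(priority, demand, capacity):
    """Greedy assumption: grant resources in priority order and report
    which functions can still be achieved."""
    remaining = dict(capacity)
    achieved = []
    for fn in priority:
        need = demand[fn]
        if all(remaining.get(r, 0) >= amt for r, amt in need.items()):
            for r, amt in need.items():
                remaining[r] -= amt
            achieved.append(fn)
    return achieved
```

Serving functions in priority order guarantees that a lower-priority function can never consume a resource that a higher-priority function would have needed.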

4  Conclusions 

We have shown how we can reason about software/hardware configuration
and reconfiguration based on the Functional Representation of the device.
Our program calculated the estimated average resource use of the device
during its functioning. Our representation contains only average values, since
the device is described at a very high level, the function level, and the
processes underlying the functions are not represented. Therefore, a detailed
quantitative simulation cannot be completely replaced by the functional
representation, but FR may help focus simulation on the most interesting
and critical possibilities.


References 

[1] Sembugamoorthy, V., and Chandrasekaran, B.: Functional Representation
of Devices and Compilation of Diagnostic Problem Solving Systems.
1986.

[2] Keuneke, Anne M.: Machine Understanding of Devices: Causal
Explanation of Diagnostic Conclusions. PhD thesis, 1988.

[3] Allemang, Dean T.: Understanding Programs as Devices. PhD thesis,
1990.

[4] Goel, Ashok K.: Integration of Case-Based Reasoning and Model-Based
Reasoning for Adaptive Design Problem Solving. PhD thesis, 1989.

[5] Sticklen, J.: MDX2, An Integrated Medical Diagnostic System. PhD
thesis, The Ohio State University, 1987.


Multilevel  causal-process  modeling: 
bridging  the  plan,  execution,  and  device-implementation  gaps 


Keith Levi
Maharishi International University
Computer Science Department
Fairfield, IA 52557

Dale Moberg
The Ohio State University
Laboratory for Artificial Intelligence Research
Columbus, OH 43210

Christopher Miller and Fred Rose
Honeywell, Inc.
Systems and Research Center
Minneapolis, MN 55418


1.  INTRODUCTION 


Knowledge-based systems typically require extensive knowledge about the domain in which they are to operate,
including knowledge about the objects and systems which exist within the domain, their properties, architectures,
capabilities, and interactions. This is particularly the case for systems that require complex reasoning, e.g., systems
that perform explanation, model-based and qualitative reasoning, multilevel and integrated reasoning, design/redesign,
training and tutoring, and certain systems for knowledge acquisition and machine learning. Acquiring and representing
knowledge at the breadth and depth required for these approaches is especially challenging in complex, real-world
domains such as industrial and aerospace applications. On the other hand, it is commonplace in these domains for
equipment to be well documented, both before and after manufacture, with detailed specifications about its
functionality and performance characteristics. It will soon be the case that most of these equipment specifications will
be written and developed as executable software simulations in hardware description languages such as the VHSIC
Hardware Description Language (VHDL).

VHDL was developed during the early 1980's as a standard, vendor-independent description of hardware. With the
growing complexity of microelectronics, it was clear that schematic representations would no longer be adequate
for description and subsequent support. VHDL was standardised by the IEEE in 1987, thereby giving the industry
a standard hardware description language. VHDL supports multiple levels of abstraction, but is most useful for
describing the functional or behavioral aspect of hardware. It allows the designer to write a high-level description
of a piece of hardware. This description can then be used for simulation, analysis, and by a multitude of other tools
that need to understand the structure and behavior of a specific piece of hardware.

Clearly, it would be of great benefit for knowledge-based systems to obtain some of their prerequisite knowledge
from such sources. As one prominent researcher in the area of explanation-based learning speculated: “We foresee a
future in which manufacturers of component equipment themselves provide descriptions, in some standard knowledge-
based formalism, of the functionality of their product just as today they provide technical descriptions in a form
understandable to [equipment] designers. As the manufacturer makes available refined versions of installed equipment,
the old knowledge-based description of the component is simply supplanted with the new. The knowledge engineer
need only oversee incorporation of the new knowledge insuring that there are no negative interactions that harm
overall system performance.” *

Realising this vision of the future involves several technical challenges. One challenge is to bridge the
representational gap between the syntax of hardware design languages and AI systems. We do not see this as a significant
technical hurdle because several studies have already demonstrated that this can be done. These investigations have
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used  AI  reasoning  systems  to  reason  about  hardware  models  *'*'^'*  or  have  done  model-based  reasoning  using  VHDL 

e,7 


A more significant challenge involves bridging a gap in semantics. By this we mean going from reasoning at a relatively
well-defined and self-contained level of hardware devices to planning and executing actions in the real world involving
uncertainties  associated  with  external  agents  and  incomplete,  uncertain,  and  incorrect  information.  All  the  previous 
studies  we  have  cited  principally  reasoned  about  behaviors  and  functions  of  a  device  and  its  subcomponents,  not 
about  the  function  of  the  device  and  its  subcomponents  in  the  real  world.  Although  one  can  envision  modeling 
the  whole  world  as  a  device  in  which  any  given  device  is  a  subcomponent,  this  has  generally  not  been  a  tractable 
approach.  It  is  this  gap  between  modeling  plans  and  their  execution  in  the  real  world  and  modeling  the  internal 
behavior  of  devices  in  hardware  design  languages  that  is  the  focus  of  this  paper. 

1.1  Domain  knowledge  requirements  for  real  world  applications  of  EBL 

Although there are many AI reasoning systems that could benefit from having a detailed source of domain knowledge,
our  motivation  for  pursuing  this  research  has  come  from  an  application  of  machine  learning  in  the  domain  of  pilot 
aiding.  DARPA  and  the  U.S.  Air  Force’s  Pilot’s  Associate  (PA)  is  a  coordinated  suite  of  five  expert  systems  which 
cooperate  to  aid  the  pilot  of  an  advanced  tactical  fighter.  Intelligent  associate  systems  such  as  PA  *  support  real-time 
planning  and  decision  making  in  dynamic  and  evolving  situations.  An  automated  associate  aids  a  human  pilot  in 
the  performance  of  a  mission  by  obtaining  and  compiling  information,  recommending  courses  of  action  and,  in  some 
cases,  actively  performing  certain  mission  tasks. 

PA’s  problem-solving  architecture  employs  distributed,  cooperating  modules.  Situation  Assessment  and  System 
Status  subsystems  combine  various  functions  involving  monitoring  the  internal  aircraft  and  external  ground  and 
air  environments.  Mission  and  Tactics  Planners  coordinate  the  planning,  scheduling  and  routing  functions.  A 
Pilot  Vehicle  Interface  subsystem  serves  to  assist  in  execution  functions  by  delivering  information  to  a  human  pilot, 
making  adjustments  to  information  supplied  in  accordance  with  his  apparent  intentions,  and  monitoring  actions 
accomplished. 

We have been conducting a research program titled Learning Systems for Pilot Aiding (LSPA) to
automate portions of the off-line process of incorporating new information into the knowledge bases of PA and
then propagating pertinent changes to various PA modules. An initial study recommended Explanation-Based
Learning (EBL) as having the best payoff for knowledge acquisition in this domain. EBL requires a domain theory of
facts and relationships in the target domain, as well as a learning instance, a new case not previously understood by
the system. In LSPA, the learning instance can come from records of actual pilot behavior in a flight simulator.
The domain theory consists of a set of relatively unchanging facts about the physical capabilities of the aircraft
and the world which it inhabits. When these facts and relationships change, such as when new technologies are
incorporated, the domain theory must be updated. Given an up-to-date domain theory, however, novel tactics can
be learned automatically from a record of the flight in which they occurred. The system builds an explanation of
how a known goal was accomplished via its understanding of the world. This explanation can then be generalized to
produce a new plan to be incorporated into PA.

A critical issue for any application of explanation-based learning is the scope and nature of the domain theory
required for generating explanations. Our domain theory rules presently cover much of a relatively self-contained
portion of the PA domain, the domain for defeating surface-to-air missiles. This is, however, only a small portion
of the domain that the PA system deals with. While our domain theory is successful as a feasibility demonstration,
the issue of whether we can scale up our system to cover the entire PA domain remains an important open question.

If the required domain theory for EBL is sufficiently large then it might be more work to build this domain theory than
it would be to simply build all of the tactical plans directly. We believe, however, that this will not be the case. An
EBL domain theory is essentially a model of the primitive functionality of the aircraft and its environment. Tactical
plans, in contrast, model all known behaviors arising from the functionality of the aircraft and its environment. This
is analogous to a set of axioms and the set of all possible theorems that can be derived from the axioms. The former
is typically a finite set and the latter is typically infinite.

Thus, the strategy of the LSPA system is that if we can create a domain theory (i.e., set of axioms) we can then
automate important parts of the knowledge engineering process for creating a PA. The issue that we are now
addressing is whether we can successfully scale up our domain theory about the pilot-aiding domain.

A full domain theory should be integrated across the different PA expert systems. This will require different
perspectives of common events, equipment, and actions for the different modules. Another important requirement, which
is the main focus of this paper, is that this integrated domain theory should also be connected to the aircraft systems.
The domain theory will necessarily be heavily based on assumptions about the aircraft and its subsystems. Aircraft
are frequently retrofitted with new equipment and capabilities. Explicitly modeling functionality dependencies of
the aircraft will be necessary for the domain theory to be easily updated as the aircraft evolves.

1.2  Outline  of  paper 

In the remainder of this paper we will report on preliminary work investigating whether an AI high-level mission
performance model can usefully connect to and share information with a low-level VHDL model of hardware
components. A novel aspect of this work is the degree of separation of the mission performance functions from the isolated
behavior of the hardware components. Other researchers have investigated diagnostic reasoning using VHDL models.
There is a significant gap, however, between the level of reasoning that is done in such diagnostic settings and our
mission performance setting. The functionality of the diagnostic context is typically closely connected to the internal
behavior of the given device. In our domain we are reasoning about functionality of hardware within a much broader
context. Any given piece of hardware might be involved in many different mission functions. Representing these
connections between the mission functions and the hardware functions is a significant challenge in itself because there
are so many layers and interconnections.

We expect that representing these connections should have significant benefits in two directions. First, there should
be top-down benefits that go from the mission performance model to the device model. An appropriate representation
should enable us to develop reasoning procedures that can derive hardware specifications from the mission
performance model. Similar reasoning procedures could verify that a given hardware design can actually fulfill mission
performance requirements, or at least systematically generate VHDL simulation testbenches based on the high-level
mission performance model. A second direction for benefits will be bottom-up from VHDL models of hardware
components to the mission performance model. These benefits will include instantiating constraints on the performance
model from VHDL models of the hardware.

In the rest of this paper we will present preliminary work in these three areas. We will first present an example
of a representation that is well-suited to capturing the multiple layers and interconnections involved in relating a
high-level mission performance model to low-level VHDL device models. We will then present examples of top-down
and bottom-up connections between our high-level mission performance model and a VHDL model of a graphics
processor for cockpit displays.


2.  FUNCTIONAL  REPRESENTATION  AS  AN  INTEGRATIVE  GLUE 


Much of our thinking about ordinary devices and how they work involves viewing those objects as contributing to
various causal processes that make the device realize its functions. Causation has been called “the cement of the
universe,” and few would contest the centrality of causal relations, and the role they play in decomposing systems
into subsystems, components, subfunctions and constituent causal processes.

One organization of our causal knowledge of a system, especially a complex system such as a pilot flying an aircraft, is
from overall functionality down to the subsystems with their subcomponents and their contributing functionalities.
A functional decomposition of aircraft flights is primarily of value to show how to move from higher levels of
functional description to lower levels of causal processes that realize functions of interest. A decomposition in which
the functional parts are both represented and related then provides the “integrative glue” for subsequent problem
solving across the levels. If we want to know how an equipment change, for example, might affect parameters
governing a maneuver for accomplishing some flight goal, the dependencies marked in a functional decomposition
can show us what to look for, and how to figure out what has changed.

Device: Ownship

Function: Safe-from-Ground-Based-Missile(PopupSamSite)

If: (Noticed PopupSamSite Obstacle to CurrentRoute)

Prevent: (SamSite launch at ownship)

Until: (Past PopupSamSite)

Provided: (And
    (No authorization to destroy encountered PopupSamSites)
    (No directive to destroy encountered PopupSamSites))

By: (XOr
    Prevent-Track-PopupSamSite
    Break-Track-PopupSamSite
    Prevent-Fire-Control-Solution)

Decomposition: If radar in search mode do chaff, electronic warfare and/or cloaking.
    If in track break track.
    If in fire-control break fire-control and (concurrently) break-track.

Figure 1: The Safe-from-Ground-Based-Missile(PopupSamSite) of Ownship

Functional Representation was developed as a representational scheme for the causal processes that culminate
in the achievement of device functions. Functional Representation (FR) explains how a device function is
achieved. The function of the overall device is described first and the behavior of each component is described
in terms of how it contributes to the function. FR has been used for diagnosis, design, redesign, explanation and
prediction. Previous investigations have also considered some aspects of how FR can assist in
machine learning, knowledge acquisition and knowledge compilation.

Functions and causal processes are closely related to goals and plans for achieving the goals. This suggests that FR
should be a natural representation for pilot-aiding domain theory. Since FR was originally developed for representing
hardware devices, it also should be a natural representation for connecting to the device level of an aircraft, i.e., in
this case VHDL models. The remainder of this section presents a preliminary set of functions and causal process
descriptions (CPDs) that model one of the scenarios from the LSPA program. The purpose of this model is to give
a sense of the representation that will be required.

The goal of the scenario we will consider is to be safe from a surface-to-air missile threat. This goal, or desired result,
is a high-level function (of a tacit aircraft-mission device). We will model the plans associated with this function as
causal processes to achieve or maintain this result.

In the pilot-aiding domain, it is quite common for several plans to be available for a given goal, even a quite specifically
delineated goal. This feature means that to model the goal-to-plan relation functionally, the usual FR formalism
needs to be augmented. The addition of Or, XOr, and Concurrently operators is one augmentation required for devices
having a situation-dependent reconfigurability (most obviously in those devices that are at least partially controlled
by agents). When such constructs are used, it is useful to have specific Decomposition Rules to indicate which CPD
should be used when. We will illustrate this augmentation to FR with the function described in Figure 1.

Figure 1 provides a function frame for a specific goal of being safe from ground-to-air missile threats. A
“PopupSamSite” is a previously unknown ground-based missile site. No mission planning or routing information would have
indicated its existence, and its presence is then an unanticipated hazard of some mission. We are presuming that the
semantics of Safety goals are linked to mainly defensive measures; the functionality of being a defensive safety goal
is indicated in the Provided field that suggests that no preemptive destructive goals are appropriate or warranted.
In general, the Provided fields are used to indicate the Standard Operating Conditions under which the current
CPDs are assumed to work. The semantics of these fields are especially important for deciding that a malfunction
exists. This is because the truth of the Provided fields is a presupposition for raising the question of whether and how
a malfunction has occurred.

PopupSamSite(x) on route in track-mode exists.
    |  By sensors, processors and display generators
    v
Noticed PopupSamSite in Track-mode (pilot)
    |  Using Function Break-Track of Ownship
    v
NoTrackMode

Figure 2: CPD: Break-Track-PopupSamSite

The  resultant  state  that  “defines”  the  Safe-from-Ground-Based-Missile(PopupSamSite)  of  Ownship  Function  is  that 
a  launch  of  the  PopupSamSite  is  prevented.  Preventing  a  launch  is  essentially  a  matter  of  denying  the  Sam-Site  a 
means  of  obtaining  a  state  that  enables  a  launch.  In  this  illustration,  the  focus  is  on  those  causal  processes  that 
block  a  launch-capable  state  by  denying  appropriate  radar  information. 

There is no unique causal process that invariably is appropriate for preventing the PopupSamSite launch. The
prevention functionality is an abstraction over a potential range of causal processes, any of which can prevent launch.
Unlike simple devices where a given function depends on some fixed causal process for its realization, a device
involving an agent operating in an uncertain environment typically needs to be represented as having a capability
to exercise a range of optional processes. In the current illustration, for example, different actions are appropriate
depending on the state of the radar of the PopupSamSite. We will imagine that radars are, if working, in one of three
mutually exclusive modes with respect to us: search, track, or fire-control-solution mode. The primary differences
concern the degree to which information has permitted the radar installation to notice and predict ownship’s (our
aircraft’s) path. Thus different actions will be appropriate for different modes. For example, if the radar is in search
mode, we desire it to remain so and only engage in measures that will impede the radar obtaining a track.
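
As a minimal sketch, assuming a Python encoding that is not part of the original work, the function frame of Figure 1 and its Decomposition Rules might be represented as follows. The class and field names (FunctionFrame, RadarMode, select_cpds) are hypothetical, and the mapping from the decomposition text to named CPDs is our reading of the figure rather than a definitive implementation.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, Tuple


class RadarMode(Enum):
    """The three mutually exclusive radar modes described in the text."""
    SEARCH = "search"
    TRACK = "track"
    FIRE_CONTROL = "fire-control-solution"


@dataclass
class FunctionFrame:
    """Hypothetical encoding of an FR function frame (a subset of the slots)."""
    device: str
    function: str
    by: Tuple[str, ...]  # CPD names listed in the By slot
    # Decomposition rules: radar mode -> CPDs to use (several means "concurrently").
    decomposition: Dict[RadarMode, Tuple[str, ...]] = field(default_factory=dict)

    def select_cpds(self, mode: RadarMode) -> Tuple[str, ...]:
        """Return the CPDs that the decomposition rules pick for this mode."""
        cpds = self.decomposition[mode]
        unknown = [name for name in cpds if name not in self.by]
        if unknown:
            raise ValueError(f"CPDs not listed in the By slot: {unknown}")
        return cpds


# Assumed mapping of Figure 1's decomposition text onto its By-slot CPD names.
SAFE_FROM_SAM = FunctionFrame(
    device="Ownship",
    function="Safe-from-Ground-Based-Missile(PopupSamSite)",
    by=(
        "Prevent-Track-PopupSamSite",
        "Break-Track-PopupSamSite",
        "Prevent-Fire-Control-Solution",
    ),
    decomposition={
        RadarMode.SEARCH: ("Prevent-Track-PopupSamSite",),
        RadarMode.TRACK: ("Break-Track-PopupSamSite",),
        RadarMode.FIRE_CONTROL: (
            "Prevent-Fire-Control-Solution",
            "Break-Track-PopupSamSite",
        ),
    },
)

if __name__ == "__main__":
    for mode in RadarMode:
        print(mode.value, "->", SAFE_FROM_SAM.select_cpds(mode))
```

Keeping the rules in a table separate from the By slot mirrors the distinction in the frame between the available causal processes and the knowledge of when each applies.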

In  this  illustration,  we  will  not  pursue  a  breadth-first  expansion  of  all  possible  CPD  expansions,  but  will  instead
imagine  a  traversal  of  our  model  that  proceeds  to  greater  and  greater  levels  of  detail.  We  next  consider  the  CPD 
of  Break-Track-PopupSamSite,  which  provides  one  way  to  prevent  the  PopupSamSite  launch,  and  thus  achieve  the 
safety  goal. 

In Figure 2, we present the next layer of decomposition in a CPD for Break-Track-PopupSamSite. The initial causal
state in this process is that there is a PopupSamSite that is in track mode. Then by a (quite complex) causal process
that we will not expand here, sensors detect signals that are processed and made available to display generators that
illuminate surfaces for the agent to sample. This sampling process by a pilot normally results in the pilot noticing
that the PopupSamSite is in track mode. This awareness of the pilot then leads to the pilot’s using a function,
Break-Track, to cause the result state of the PopupSamSite losing track mode.

Device: Ownship

Function: Break-Track

If: (PopupSamSite(LocationIdentifier) in track mode)

Makes: (PopupSamSite(LocationIdentifier) in search mode)

By: (Or Constant-Distance-Behavior ElectronicMeasures)

Decomposition: If Type(PopupSamSite 1) then Constant-Distance-Behavior.

Figure 3: Break Track Function

NOA not equal to 90 degrees
    |  Using sensor-signalprocessor-displaygenerator functions of Ownship
    v
Pilot Notices NOA not equal to 90 degrees
    |  Using Function Make-NOA-Equal-90 of Ownship
    v
NOA equal to 90 degrees
    |  Using Function Maintain-NOA-Equal-90 of Ownship
    v
SamSite in search mode

Figure 4: CPD: Constant-Distance-Behavior(fixedground)

One  virtue  of  the  FR  and  CPD  modeling  approach  is  that  we  are  able  to  skip  over  some  processes  for  which  we  do 
not  have  models  at  a  complete  level  of  detail.  We  do  not  know  the  precise  retinal  and  cortical  events  leading  to
pilot  registration  of  a  situation,  for  example.  Of  course,  we  may  know  some  behavioral  parameters  for  this  process 
(such  as  average  response  time).  Within  the  FR  and  CPD  approach,  such  variables  of  interest,  and  equations  for 
computing  them  when  known,  are  normally  attached  to  the  links  and  state  nodes  that  are  marked  by  the  relevant 
qualitative  states  and  transitions. 

In  Figure  3,  we  provide  a  function  frame  for  the  Break-Track  function  that  we  have  just  mentioned.  To  aid  in 
seeing  how  a  functional  decomposition  is  produced,  it  is  useful  at  this  point  to  further  describe  our  domain.  The 
specific  leaf  plan  of  a  possible  PA  plan-goal  graph  that  we  are  capturing  involves  a  safety  plan,  called  a  doppler 
notch  maneuver,  that  works  in  the  following  way.  We  suppose  there  is  a  particular  ground  based  missile  and  radar
configuration  that  employs  a  doppler  radar  that  can  be  “tricked”  by  flying  in  a  circle  around  the  site  and  maintaining 
constant  distance  from  the  radar  installation.  We  now  wish  to  translate  this  maneuver  into  the  underlying  causal 
processes  that  carry  it  out.  We  next  expand  our  representation  preferentially  to  consider  the  decomposition  of  the 
Constant-Distance-Behavior  CPD  indexed  in  the  By  slot  of  Figure  3. 

Figure 4 shows the expansion of the pilot behavior that goes into the doppler notch maneuver. To achieve a constant
distance effect, the pilot thinks of flying in an arc that maintains the radar base at an angle of ninety degrees off
the line through the plane’s nose through its tail. We refer to this orientation mode as NOA, for nose-off-angle. The
maneuver is decomposable into a state of being in a NOA unequal to ninety degrees leading to a state of noticing
that fact, followed by making and then maintaining the NOA equal to ninety degrees.

Device: Ownship

Function: Maintain-NOA-Equal-90

If: (NOA within 90 +/- 7.5 degrees)

Maintains: (NOA within 90 +/- 7.5 degrees)

By: (PrecisionFixed)

Decomposition knowledge comments:

Figure 5: Function Maintain-NOA-Equal-90(PopupSamSite) of Ownship


Figure 6: CPD: Precision-Fixed-Maneuver

Figure 5 represents the function frame for the function Maintain-NOA-Equal-90. The If slot states that this function
is only appropriate when ownship is already close to a NOA of 90 degrees (e.g., +/- 7.5 degrees). The By slot states
that this function is accomplished by a class of maneuver that we call a PrecisionFixed maneuver. Such a maneuver
is typically a very deliberate and patiently executed maneuver with a concern for steady and exacting tolerances.
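
The If slot of Figure 5 can be read as a simple applicability test. The sketch below assumes that reading; the function names are hypothetical, and the choice to fall back to Make-NOA-Equal-90 when the test fails is our interpretation of the sequence in Figure 4, not something the frames state explicitly.

```python
NOA_TARGET_DEG = 90.0      # desired nose-off-angle from Figure 5
NOA_TOLERANCE_DEG = 7.5    # tolerance stated in the If slot of Figure 5


def noa_within_tolerance(noa_deg: float,
                         target_deg: float = NOA_TARGET_DEG,
                         tolerance_deg: float = NOA_TOLERANCE_DEG) -> bool:
    """True when the If slot of Maintain-NOA-Equal-90 is satisfied."""
    return abs(noa_deg - target_deg) <= tolerance_deg


def select_noa_function(noa_deg: float) -> str:
    """Pick the Ownship function to use for the current NOA (assumed rule)."""
    if noa_within_tolerance(noa_deg):
        return "Maintain-NOA-Equal-90"
    return "Make-NOA-Equal-90"


if __name__ == "__main__":
    for noa in (60.0, 85.0, 97.5, 120.0):
        print(noa, "->", select_noa_function(noa))
```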

Figure  6  further  decomposes  the  PrecisionFixed  type  of  maneuver.  Such  a  maneuver  starts  out  in  a  state  of  knowing 
information  such  as  the  direction  of  the  maneuver,  the  points  at  which  to  initiate  and  terminate  the  maneuver,  and 
how  much  or  how  strong  of  a  turn  the  maneuver  will  involve.  The  pilot  then  must  monitor  for  the  initiation  point
and  start  the  maneuver  upon  reaching  that  point.  The  pilot  initiates  and  then  maintains  the  maneuver  by  making 
precise  movements  of  the  stick  until  he  nears  the  termination  point,  at  which  time  he  eases  out  of  the  turn  by  another 
precision  stick  movement. 

Figure 7 describes the function of making a precision stick movement. This function requires that the stick must
initially be close to the desired position. It also requires that the pilot has perceptual feedback at a sufficiently
fine-grained and accurate level to permit the precision maneuver. For example, if NOA were only displayed in 10
degree increments then it would be difficult for the pilot to make the turn within a tolerance of a few degrees. Similarly,
the same problem will arise if the NOA information is only accurate to within 10 degrees.

Device: Stick

Function: Precision-stick-movement

If: close to desired “direction”

Makes: equal to desired “direction”

Provided: accuracy of instrumentation within range of “close”

By: small gradual movement of stick guided by pilot’s perceptual-motor skills

Decomposition knowledge comments:

Figure 7: Precision Stick Movement Function

We have now reached a level where we can begin to connect our mission performance model to hardware components
and VHDL models of those hardware components. For example, this function could easily include pointers to
hardware components that provide information on NOA. In this case such information would probably come from
a radar warning receiver. We could use this connection in two general ways. One use would be to specify the
accuracy required in NOA readings by this maneuver. This would be deriving a hardware specification from the
high level mission-performance model. A second type of use is an inverse of the first. This would be to constrain the
performance model by available accuracy limits of different instruments, such as the radar warning receiver in this
case.
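
The two uses just described can be sketched as a pair of checks. This is an illustration under stated assumptions, not a description of any implemented system: the InstrumentSpec type, the example instruments, and the error model (half a display increment of quantization plus the stated accuracy) are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstrumentSpec:
    """Hypothetical summary of what a device model reports for a NOA source."""
    name: str
    display_increment_deg: float  # granularity at which NOA is displayed
    accuracy_deg: float           # stated accuracy of the underlying reading


def worst_case_error_deg(spec: InstrumentSpec) -> float:
    """Assumed error model: rounding to the nearest increment plus accuracy."""
    return spec.display_increment_deg / 2 + spec.accuracy_deg


def supports_maneuver(spec: InstrumentSpec, tolerance_deg: float) -> bool:
    """Bottom-up use: does the instrument permit a maneuver with this tolerance?"""
    return worst_case_error_deg(spec) <= tolerance_deg


def required_accuracy_deg(tolerance_deg: float,
                          display_increment_deg: float) -> float:
    """Top-down use: accuracy the hardware must provide for a given tolerance."""
    return max(0.0, tolerance_deg - display_increment_deg / 2)


if __name__ == "__main__":
    coarse = InstrumentSpec("coarse NOA source", 10.0, 1.0)
    fine = InstrumentSpec("fine NOA source", 1.0, 2.0)
    print(supports_maneuver(coarse, 3.0))   # a few degrees of tolerance
    print(supports_maneuver(fine, 7.5))     # tolerance from Figure 5
    print(required_accuracy_deg(7.5, 1.0))
```

The same error model serves both directions: solved for the tolerance it constrains the performance model, and solved for the accuracy it yields a hardware specification.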


3.  BOTTOM-UP  EXAMPLE  OF  BRIDGING  MISSION  PERFORMANCE
AND  DEVICE  IMPLEMENTATION  LEVELS


The LSPA program was divided into two parts: the Learning System for Tactical Plans (LSTP) which used machine
learning techniques to acquire a new plan by observing a pilot-flown example, and the Learning System for Information
Requirements (LSIR) which took the newly-learned plan as input and generated a set of information elements a
pilot would require in order to perform the plan. The Pilot-Vehicle Interface module of the PA uses the lists of
information elements generated by LSIR to dynamically select a set of display formats for presentation in the cockpit
based on the set of active plans. LSIR uses several parameters to describe and reason about each information element.
Two examples are responsiveness and task rate-of-change. The responsiveness parameter indicates how quickly an
information element changes value in response to a query or command for a change. Task rate-of-change refers to
how often the values of a given information element tend to change.

Even  though  the  parameters  of  LSIR  information  elements  are  at  a  very  detailed  grain  size  relative  to  tactical  plans,
they  involve  information  at  a  quite  large  grain  size  relative  to  most  existing  VHDL  models.  One  of  the  authors,
however,  has  been  a  principal  designer  and  implementor  of  a  model  using  VHDL  for  systems  performance  and 
architectural  evaluation  of  the  next-generation  cockpit  display  generator.  This  VHDL  model,  known  as  the  Cockpit 
Display  Generator  (CDG),  was  sponsored  by  the  Wright  Laboratories  Cockpit  Technology  Directorate.  It  is  being 
driven  primarily  by  the  development  of  next-generation  rotary  wing,  transport  and  fighter  aircraft. 

CDG is the only existing VHDL model we know that comes close to describing avionics devices at an appropriate
level for these LSIR parameters. Because of the availability of CDG, we have focused our analyses on information
parameter values that are entirely controlled by the graphics processor, primarily those which are aspects of the
displays themselves. One such information element is the range setting of the Tactical Situation Display (e.g., 5, 10,
20, 40, or 60 nautical miles). The responsiveness and task rate-of-change parameters for the Tactical Situation Display
range setting are only affected by the performance of the graphics processor and the display surfaces themselves.

Knowledge about latency and throughput values for information requirements that are localized in the graphics
processor is readily available via simulation of the CDG models. Latency values can be directly used for the LSIR
responsiveness parameter. The task rate-of-change parameter can easily be calculated from the VHDL throughput
values. Although these two parameters are a limited portion of the total number of parameters that LSIR requires,
our analyses indicate that we should be able to obtain a substantial proportion of the total number of parameters
once more detailed VHDL models become available. We expect that VHDL models of system level devices will
become relatively common over the next 10 to 15 years.

A major limitation of the present CDG model is that it is a VHDL performance model. That is, it only models
processing times. This is in contrast to VHDL functional models which will represent and reason about the values that
are produced in addition to simply modeling how fast inputs are processed. However, even with pure performance
models, latency and throughput values can be derived for information requirements not exclusively associated with
the graphics processor. This can be accomplished by “stringing” VHDL models together. For example, the latency
associated with a commanded change to the mode of the aircraft’s FLIR will be the sum of the latencies from the
pilot’s button press, through the mission computer, to the FLIR itself, then from the FLIR to the mission computer
to the graphics processor to the display. The throughput for value changes will be the limiting throughput along the
same path.
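
The composition rule for “stringing” performance models together is simple enough to state as code. The sketch below assumes each model along the path has been reduced to a latency and a throughput figure; the Stage type and all numeric values are invented for illustration and do not come from the CDG model.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Stage:
    """One performance model on the path, reduced to two figures."""
    name: str
    latency_ms: float      # time for one input to pass through the stage
    throughput_hz: float   # value changes per second the stage can sustain


def path_latency_ms(path: Sequence[Stage]) -> float:
    """End-to-end latency is the sum of the stage latencies."""
    return sum(stage.latency_ms for stage in path)


def path_throughput_hz(path: Sequence[Stage]) -> float:
    """End-to-end throughput is limited by the slowest stage."""
    return min(stage.throughput_hz for stage in path)


# Path for a commanded FLIR mode change, as described in the text.
# All numbers are illustrative placeholders.
FLIR_MODE_CHANGE_PATH = [
    Stage("pilot button press", 5.0, 100.0),
    Stage("mission computer (command)", 10.0, 200.0),
    Stage("FLIR", 40.0, 30.0),
    Stage("mission computer (return)", 10.0, 200.0),
    Stage("graphics processor", 15.0, 60.0),
    Stage("display", 20.0, 60.0),
]

if __name__ == "__main__":
    print("latency (ms):", path_latency_ms(FLIR_MODE_CHANGE_PATH))
    print("throughput (Hz):", path_throughput_hz(FLIR_MODE_CHANGE_PATH))
```

The mission computer appears twice because the path passes through it once on the way to the FLIR and once on the way back to the display.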


4.  TOP-DOWN  EXAMPLE  OF  BRIDGING  MISSION  PERFORMANCE 
AND  DEVICE  IMPLEMENTATION  LEVELS 


The integrated representation we presented in section 2 was helpful in organizing our thinking about plans,
information elements, and their connection to the VHDL device models. This representation, however, is not strictly
necessary for implementing the bottom-up use of the CDG model we described for providing parameters on LSIR
information elements. In this section we describe another possible connection between the mission performance
systems and device level descriptions in VHDL. We refer to this as a top-down connection because we propose to
use the high level mission performance model to derive parameters for use in the VHDL model. In particular, we
suggest an approach to generating testbenches for VHDL simulations.

Note  that  the  VHDL  simulations  we  described  above  need  to  be  sensitive  to  the  relevant  plans  with  which  the
information  elements  are  associated.  For  example,  task  rate-of-change  of  a  velocity  display  (how  often  the  values  of 
the  display  change)  will  be  very  different  for  maneuvers  that  involve  fast  accelerations  versus  maneuvers  that  only 
involve  cruising  at  constant  velocities.  Such  plan  dependencies  require  that  the  simulation  include  the  relative  load 
levels  on  the  graphics  processor  imposed  by  other  activities  which  are  known  to  be  ongoing  during  the  plan.  For 
example,  during  a  Doppler  Notch  maneuver,  high-g  turns  mean  that  the  digital  map  equipment  on  the  aircraft  will 
be  required  to  go  through  a  large  amount  of  terrain  data  and  be  updated  frequently.  This  implies  a  relatively  high 
load  on  the  graphics  processor  compared  to  most  other  plans. 

A simulation of the graphics processor load might be determined in accordance with the following scheme for
automatic generation of a testbench of the graphics processor avionics subsystem. First, instantiate a high level plan
for a given situation. This could easily be done by using flight simulators such as were used on the PA and LSPA
programs. For each plan that is operative in the generated situation (and perhaps a background load that is typical
for the system), a list of display-relevant events must be generated. Rules for inferring such an event list already exist
as part of the LSIR system. New rules must be added that will translate these event lists into data flow estimates
for the VHDL model. These estimates will provide the initial conditions for the simulation run of the VHDL model.
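
The scheme above can be sketched as a small pipeline from active plans to per-display data flow estimates. Everything named here is a hypothetical stand-in: EVENTS_BY_PLAN plays the role of the LSIR rules that infer display-relevant events, BYTES_PER_UPDATE plays the role of the new translation rules, and the plan names, display names and numbers are invented for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DisplayEvent:
    """A display-relevant event inferred for a plan (hypothetical form)."""
    display: str           # display surface the event updates
    updates_per_s: float   # how often the event forces an update


# Stand-in for the LSIR rules that infer an event list from an active plan.
EVENTS_BY_PLAN: Dict[str, List[DisplayEvent]] = {
    "doppler-notch": [
        DisplayEvent("digital-map", 30.0),
        DisplayEvent("tactical-situation", 10.0),
    ],
    "background": [
        DisplayEvent("tactical-situation", 2.0),
    ],
}

# Stand-in for the new translation rules: bytes sent per display update.
BYTES_PER_UPDATE: Dict[str, int] = {
    "digital-map": 4096,
    "tactical-situation": 512,
}


def data_flow_estimates(active_plans: List[str]) -> Dict[str, float]:
    """Sum the estimated bytes per second per display over all active plans."""
    load: Dict[str, float] = defaultdict(float)
    for plan in active_plans:
        for event in EVENTS_BY_PLAN[plan]:
            load[event.display] += event.updates_per_s * BYTES_PER_UPDATE[event.display]
    return dict(load)


if __name__ == "__main__":
    # The resulting figures would seed the initial conditions of a simulation run.
    print(data_flow_estimates(["doppler-notch", "background"]))
```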

The Functional Representation framework would be used to control the computation and temporal ordering of
the data flow estimates. The idea is to attach methods to the Functional Representation that compute data flow
estimates. These computations will depend on their location in the representation and instantiations of variables for
activated plans. A traversal of the representation that respects the causal transition order should produce the proper
temporal ordering on the data flow specifications.
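
One way to realize such a traversal, offered as a sketch rather than as the method we have implemented, is to treat a CPD as a directed graph of states and visit it in topological order. The example uses the states of Figure 4; the estimator functions attached to states, the variable names in the bindings, and the byte counts are all hypothetical. It relies on graphlib.TopologicalSorter from the Python standard library (Python 3.9 or later).

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set, Tuple

# CPD of Figure 4, written as state -> set of states that causally precede it.
CPD_PREDECESSORS: Dict[str, Set[str]] = {
    "NOA not equal to 90": set(),
    "Pilot notices NOA not equal to 90": {"NOA not equal to 90"},
    "NOA equal to 90": {"Pilot notices NOA not equal to 90"},
    "SamSite in search mode": {"NOA equal to 90"},
}

# Hypothetical methods attached to states. Each computes a data flow estimate
# (bytes per second) from the variable bindings of the activated plan.
ESTIMATORS: Dict[str, Callable[[Dict[str, float]], float]] = {
    "Pilot notices NOA not equal to 90": lambda b: 512.0 * b["display_hz"],
    "NOA equal to 90": lambda b: 4096.0 * b["map_hz"],
}


def ordered_data_flow(bindings: Dict[str, float]) -> List[Tuple[str, float]]:
    """Visit states in causal order and collect the attached estimates."""
    order = TopologicalSorter(CPD_PREDECESSORS).static_order()
    return [(state, ESTIMATORS[state](bindings))
            for state in order if state in ESTIMATORS]


if __name__ == "__main__":
    for state, estimate in ordered_data_flow({"display_hz": 10.0, "map_hz": 30.0}):
        print(f"{state}: {estimate} bytes/s")
```

Because the order is derived from the causal links themselves, adding or reconfiguring CPD states changes the temporal ordering of the data flow specifications without any separate scheduling knowledge.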


5.  CONCLUSIONS  AND  FUTURE  WORK 


We  have  reported  preliminary  work  investigating  whether  an  AI  high-level  mission  performance  model  can  naefnlly 
connect  to  and  share  information  with  a  low-level  VHDL  model  of  hardware  components.  To  date,  our  work  has 
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mainly  been  an  analysis  of  the  problem  based  on  two  AI  high-level  mission  performance  models,  the  Pilot’s  Associate 
and  Learning  Systems  for  Pilot  Aiding  systems,  and  a  system  level  VHDL  model  of  a  graphics  display  processor, 
the  CDG  model.  In  this  paper  we  have  sketched  an  example  of  a  connection  between  such  systems  using  Functional
Representation.  We  have  also  described  an  example  in  which  information  for  the  mission  performance  model  is 
obtained  from  a  VHDL  device  model  and  another  example  in  which  testbench  parameters  for  the  VHDL  model  can 
be  generated  from  the  mission  performance  model. 

These  examples  are  all  in  early  stages  of  development.  The  examples  of  reasoning  between  the  two  types  of  systems 
presently  only  make  allusions  to  using  the  Functional  Representation  that  we  described.  The  value  of  this  integrated 
representation  has  so  far  primarily  been  in  organizing  our  conceptualization  of  the  problem  domain.  We  believe,
however,  that  the  only  notion  broad  enough  to  encompass  all  connections  of  hardware,  software,  human  behavior 
and  external  world  environment  at  the  mission  performance  level  is  that  of  functionality.  Thus,  we  anticipate 
that  such  a  representation  will  become  increasingly  crucial  to  usefully  bridging  the  mission  performance  and  device 
implementation  levels  as  this  work  proceeds. 

In  this  paper  we  have  defined  our  problem  and  approach  and  have  sketched  a  representation  for  a  small  example. 
We  have  also  given  two  examples  of  potential  uses  of  this  representation.  Our  next  steps  will  include  implementing 
examples  of  these  top-down  and  bottom-up  uses.  Our  future  plans  also  include  scaling  up  our  representations  to
handle  more  scenarios  and  the  different  perspectives  that  will  be  required  by  the  different  modules  such  as  situation 
assessment  and  system  status. 